PREFACE


Geographic Information Systems (GIS) has emerged as a powerful tool for 
solving complex problems due to its capabilities to integrate, visualize, and 
analyze geographic data across domains and disciplines. An urgent need for 
balancing fundamental human needs with ecological capacity has compelled 
us to utilize geospatial technology for a sustainable future. This volume
showcases GIS applications in transportation, water management, agriculture, 
seismology, archeology, and community development. Part I highlights 
different GIS techniques: network analysis to optimize solid waste collection; 
spatio-temporal analysis of phone-call records to explore human mobility; 
spatial interpolation and overlay to assess water quality. Part II illustrates GIS 
applications in planning for the future and reconstructing the past: how GIS is
used to aid asset-building processes for sustainable community development, 
enhance the transport planner’s toolbox through visualization of transportation 
activities, and determine likely land pathways to transport megaliths. Part III
exhibits integration of geospatial technology (GIS, GPS, remote sensing) for 
environmental monitoring and sustainable management of natural resources. 
Geospatial technologies are coupled with modeling to automate the pesticide 
spraying process and forecast pests; analyze the spatial patterns of seismicity; 
identify flood prone areas and assess groundwater potential. This volume 
provides a snapshot of current developments in the geospatial fields, and will 
be valuable to academics and practitioners in geography, planning, and 
sustainability science. 
1. Application of Vehicle Routing Problem for Sustainable Waste
Collection: Case Study of Altoona, Pennsylvania 

2. Spatio-Temporal Visualization and Analysis Techniques in GIS 

3. Role of Geographic Information System for Water Quality Evaluation 


II. Applications 

4. Conceptual Framework for using GIS in Building Community Capital 
towards Sustainability 

5. GIS Applications in Practice: Exploring Spatial Dynamic of Transport 
Activities 

6. Establishing Megalith Transport Routes Using Geographical 
Information System 


III. Technologies 

7. GIS Applications in Modern Crop Protection 

8. Geoinformation Systems for Studying Seismicity and Impact 
Cratering Using Remote Sensing Data 

9. Flood Risks Analysis in a Littoral African City: Using Geographic 
Information System 

10. Use of Remote Sensing and GIS for Groundwater Potential Mapping 
in Crystalline Basement Rock (Sabodala Mining Region, Senegal) 


Chapter 1 - The management of solid waste in the City of Altoona, 
Pennsylvania, USA is unique in that a department responsible for the design 
and collection of solid waste is non-existent. Further, the city does not contract 
any particular company for collection. Rather, the city utilizes a freedom to 
choose system where residents can choose from any one of twenty companies 
for their solid waste collection. The Intermunicipal Relations Committee (IRC) 
is the local organization responsible for overseeing and enforcing waste and 
recycling regulations within the city. The freedom to choose system is highly 
inefficient. The sheer number of companies operating within the city makes it 
difficult for the IRC to enforce regulations as each company’s customers are 
scattered throughout the city. On any particular day, several collection trucks 
could be driving through the same neighborhood. This results in lengthy
collection times and unnecessary miles traveled. This research utilizes the GIS 
spatial analyst vehicle routing problem (VRP) function to model the current 
freedom to choose collection system and determine total collection times, 
miles traveled, and number of trips to the transfer station. Two alternative 
collection scenarios are proposed and modeled. Results indicate the 
inefficiencies associated with current collection when compared to two 
alternate scenarios. A controlled collection scenario reduces miles traveled by
70 percent and collection time by 44 percent. Greater savings of 76 and 50 
percent occur with the improved efficiency scenario. Results confirm the 
wasteful miles traveled and man hours worked, thus demonstrating the need 
for city officials to implement changes that would bring savings to collection 
companies, customers, and the environment. 

Chapter 2 - Sustainability, the balancing of fundamental human needs with
ecological resilience, has been embraced as an overarching policy goal, and
communities have been called to participate in the process of attaining that
ideal. Community-based organizations (CBOs) can benefit from using GIS in
building community assets and developing sustainability initiatives. However,
GIS has not yet been widely used for these purposes in CBOs. In this chapter,
the author illustrates how geographic information (such as maps) can be useful
in community development, drawing from community GIS projects, and
explains how theories of sustainability and spatial thinking can be utilized in
community-based efforts towards sustainability. CBOs can monitor and assess 
community sustainability by (a) organizing relevant indicators into the capital 
framework (theories of sustainability), and (b) exploring spatial distribution, 
interactions, relationships, and changes in sustainability-related issues using 
GIS (spatial thinking). The framework presented here can be applied to 
promote effective use of geospatial tools for community sustainability. 

Chapter 3 - Spatio-temporal visualization and analysis techniques play
important roles in geographic information systems and knowledge discovery.
In this research, the authors introduce three spatio-temporal analytical
techniques (spatio-temporal visualization, space-time kernel density
estimation, and spatio-temporal autocorrelation analysis) for exploring human
mobility and urban structure patterns. These spatio-temporal analytics can help
to answer questions such as: Where are the spatio-temporal hotspots of human
activities? How can spatio-temporal patterns be explored in three-dimensional
GIS? Experiments are conducted using a large set of detailed phone-call
records in urban space.

Explorations of spatio-temporal functionality and techniques contribute
to the GIScience community for the future development of Space-Time GIS;
more broadly, they have the potential to be applied in other disciplines, e.g.,
environmental, urban and social sciences.

Chapter 4 - Alongside agricultural engineering, plant breeding and
fertilization, crop protection is an indispensable part of modern agriculture.
Although it cannot be substituted, it has a downside: the use of pesticides,
and therefore the release of chemicals into the environment, bears substantial
risks for both nature and human beings. Geographical Information
Systems (GIS) can help make crop protection more sustainable. This chapter
describes two examples of how GIS is used by the Central Institute
for Decision Support Systems (DSS) in Crop Protection (German acronym
ZEPP) to support farmers in Germany with their pesticide applications.

The first example describes a DSS that creates machine readable 
application maps using a web based GIS application. Application maps offer 
the possibility to automate the pesticide spraying process. The maps created by 
the DSS include legal buffer zones around water bodies and protected terrestrial
structures, e.g. hedges, where spraying of pesticides is prohibited. Provided
that a tractor with a Global Navigation Satellite System (GNSS) and a pesticide
sprayer with section control are available, an automated application is possible.
Once the sprayer moves into an area of the field that is a buffer zone, the
respective section is switched off automatically. The DSS helps farmers to
comply with legal rules and to prevent the contamination of the environment. 
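
The section-control behaviour described above can be illustrated with a minimal sketch. It is not the ZEPP implementation: the function names, the planar metre coordinates, the stream geometry and the 10 m buffer width are all hypothetical, chosen only to show how a boom section could be switched off once it lies within a buffer distance of a water body.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates, metres)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both end points coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def section_states(section_positions, water_line, buffer_width_m):
    """Return True (spray on) or False (spray off) for each boom section."""
    states = []
    for pos in section_positions:
        nearest = min(
            point_segment_distance(pos, a, b)
            for a, b in zip(water_line, water_line[1:])
        )
        # A section inside the buffer zone must not spray.
        states.append(nearest > buffer_width_m)
    return states

# Hypothetical data: a straight stream and three boom sections at 2, 8 and 14 m from it.
stream = [(0.0, 0.0), (100.0, 0.0)]
boom = [(50.0, 2.0), (50.0, 8.0), (50.0, 14.0)]
print(section_states(boom, stream, buffer_width_m=10.0))
```

In a real application map the buffer geometry would be prepared in advance by the GIS, and the terminal on the tractor would only need to test the GNSS position of each section against it.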

The second example describes how GIS is used in pest forecast systems. 
With the help of GIS it is possible to obtain results with higher accuracy for 
disease and pest simulation models. Pest forecast systems developed by ZEPP 
use GIS to interpolate geographical factors such as temperature and relative
humidity, thereby obtaining meteorological data for every km² in Germany. The
interpolated data and precipitation, taken from radar-measured
precipitation data, are used as input for the simulation models. The output of
these models is presented as spatial risk maps in which areas of maximum risk
of disease outbreak, infection pressure or pest appearance are displayed.
The modern presentation methods of GIS make interpretation easy and
furthermore promote the use of the system by farmers.

Chapter 5 - The spatial pattern of urban transport activities has become a 
focus of recent academic enquiry and planning policy concerns. This is largely 
driven by the rapid urban growth and increased transport pressure in major 
international cities and the demand for improved transport infrastructure and 
services. This article focuses on the application of Geographical Information 
Systems (GIS) techniques in exploring geographic patterns of major urban 
transport activities at both urban and regional scales. The first part of this
article develops GIS methods to analyse geographical patterns of commuting
transport at a regional level. The methodology uses multiple spatial O-D
transport data at regional geographical units and applies disaggregated spatial
techniques to identify spatial patterns of commuting distance and traffic flow
and the changes in these patterns over time. The second part of the article
demonstrates the application of GIS techniques in exploring tempo-spatial
patterns of public bicycle trips in urban areas. A GIS technique called the flow
map is developed to explore tempo-spatial patterns of public bicycle use under
different calendar events and climatic conditions. The paper demonstrates how
the results from the GIS techniques may form part of an evidence base for the
transport planner, with the potential to inform future development and enhance
efficiency, and how the application of GIS techniques will enhance the
planner’s toolbox whilst responding to transport planning issues.

Chapter 6 - Limited evidence has led to considerable debate about land 
routes and methods to move megaliths chosen for sculpture by prehistoric 
societies. The current research investigates the question using a Geographic
Information System (GIS) to determine likely land pathways for transporting
megaliths over 100 kilometres in Mesoamerica by the Preclassic Olmec society.
Access was restricted and the terrain included floodplains, seasonal rivers and
extensive swamps. Analyses were derived from digitised survey maps using
slope gradient tools initially from ARCVIEW 3.2 and finally ARC10.
Although compatibility issues arose with this combination as the authors 
describe, tools of both versions provided a starting point between stones’ 
source from which to then define a pathway across the challenging terrain. 

Chapter 7 - This chapter presents a tool for studying natural disasters, such 
as seismic and impact events, using real data from Catalogs of earthquakes 
(EC) and Earth’s impact structures (EISC) [1]. It is ENDDB [2] (the Earth’s 
Natural Disasters DataBase), a new version of a geoinformation system (GIS). 
The algorithms implemented in ENDDB allow visualizing a selected part of a 
current catalog in a pseudo-3D background map. With its mathematical 
support, the ENDDB system can plot frequency dependences of magnitudes or 
sizes (crater diameters) of events from various samples, as well as other 
distributions of integrated parameters in time and space, or their relationships 
with one another. 

The expert earthquake database (EEDB). There existed an earlier GIS 
version, an EEDB system [3], which had a wide range of seismological 
applications. It was a gradual transition from a conventional GIS (originally
created by the authors) to a high-tech expert system updated by including 
successively various mathematical methods for earthquake data processing, 
new parameters of seismic regime, and advanced representation tools. The 
realized algorithms [4] allow the user to compute and visualize maps and 
diagrams of seismicity parameters (slope of magnitude-recurrence curves, 
seismic quiescence, earthquake density, etc.), to reveal clustering of events, 
and remove aftershocks. Modifications and versions of GIS-EEDB for 
different geodynamic regions [5] are illustrated in the chapter with case studies
of seismic anomalies. 

Visualization and analysis of EISC data. Applying the EEDB system 
software to EISC data [1] (in the new GIS system, called Earth’s Impact 
Structures Catalog (EISC) [6]) allows gaining insights into spatial patterns of 
impact structures. In addition, the shapes of craters are constrained using a 
shaded relief model based on NASA data arrays of SRTM (Shuttle Radar 
Topography Mission) and ASTER GDEM (Global Digital Elevation Model), 
and the technology of digital mapping. Thus typical elements of impact crater
morphology have been systematized and can be used as indicators of the crater
origin [7]. 

Gravity data and new applications of GIS ENDDB. The reliability of 
geomorphically expressed diagnostic indicators of crater shapes was checked 
against geophysical features revealed by gravity data, namely, the presence of 
tail-shaped negative gravity anomalies produced by large impact craters. By 
mapping gravity anomalies, using their shaded relief model and “Global 
marine gravity” data (V18.1), the authors can verify the gravity patterns 
associated with impact cratering and check their validity as tracers of bolide 
trajectories. Gravity data also have seismological implications and can be used 
to identify seismic blocks, lineaments, and other structures detectable with GIS 
ENDDB mathematical tools and thus to analyze the spatial patterns of 
seismicity. 

Chapter 8 - Water quality evaluation is an overall process of evaluating the
physical, chemical and biological nature of water in relation to natural quality,
human effects and intended uses, particularly uses which may affect human
health and the health of the ecosystem itself. Interpretation of enormous water
quality datasets in a manner convenient for visual inspection is an important but
often underestimated or omitted step in a water quality evaluation program.
Recently, the need for modern approaches and tools for interpreting water quality
has been emphasized for efficient water quality management. Geographic
Information System (GIS), with its ability to capture, store, analyze,
manipulate, retrieve and display spatial data, has emerged as a
powerful tool for decision-making in several areas, including the environmental
field. This chapter aims at highlighting the role of GIS in synthesising,
compiling, presenting and interpreting chemical data of both surface and
ground waters. First, a few relevant fundamental terms and the process of water
quality evaluation are defined and/or described. Thereafter, the chapter
presents a theoretical procedure for applying GIS to assess spatial change or
variability in water quality by characterizing the extent and patterns of
contamination. In general, a water quality monitoring network consists of a
group of point locations with known chemical attributes of water. GIS helps
convert the point values into areal information through spatial interpolation.

Hence, an overview of spatial interpolation techniques is provided, 
together with the methodologies for employing geostatistical modelling 
(kriging) and inverse distance weighting techniques and for computing spatial 
statistics (mean, median, standard deviation and coefficient of variation). The 
major application of GIS in past groundwater studies has been for assessing 
groundwater vulnerability. 
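
The inverse distance weighting and summary statistics mentioned above can be illustrated with a minimal sketch. It is not taken from the chapter: the function names, the power parameter of 2, and the two monitoring wells with their concentration values are hypothetical, and the coefficient of variation is computed here with the population standard deviation as one possible convention.

```python
import math
import statistics

def idw(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (xi, yi, value) monitoring points.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            # The target coincides with a monitoring point: use its value directly.
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

# Hypothetical wells: (x, y, concentration in mg/L).
wells = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
print(idw(5.0, 0.0, wells))                      # midway between the two wells
print(statistics.mean(v for _, _, v in wells))   # mean of the point values
print(coefficient_of_variation([v for _, _, v in wells]))
```

Kriging differs from this scheme in that the weights are derived from a fitted variogram model rather than from distance alone, which is why it also yields an estimate of interpolation uncertainty.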

Therefore, the concept of groundwater vulnerability along with its 
historical perspective is described and different GIS-based overlay and index 
methods used for groundwater vulnerability assessment are summarized. 
Methodologies for applying different GIS methods in evaluating the 
groundwater vulnerability are illustrated through flowcharts. 

The major tools for describing groundwater vulnerability in a GIS
framework include DRASTIC, modified DRASTIC, DRAMIC, GOD, AVI, 
SINTACS, EPIK, GLA, PI and COP. 

Furthermore, the development of a GIS-based water quality index for
evaluating water quality is discussed. Finally, the combined use of GIS and
multivariate statistical analysis techniques in delineating water quality zones is
discussed. It is concluded that GIS is a promising geospatial tool which offers
an efficient framework for sustainable management of freshwater resources.

Chapter 9 - Flood hazard has become the most frequent natural disaster
and has provoked a global response. Due to its devastating nature, many
lives have been lost, natural ecology degraded, social and economic
activities disrupted and properties worth billions of dollars destroyed. Such
overwhelming effect is felt more in urban centers, especially those located in
coastal regions. Estimating flood risk was a complex multi-faceted problem
due to the level and amount of knowledge in systematic disciplines such as
geography, geomorphology, climatology, hydrology, hydraulic engineering
and urban planning that needs to be combined.

Presently, this problem has been surmounted with the introduction of
geographic information system (GIS) which, when properly integrated with
remote sensing techniques, has the capability to transform the manner
of modeling flood risk and extracting spatial information to support decision
making processes.

The substantive objective of this chapter is to examine the effectiveness of
using high resolution remote sensing data and GIS techniques to assess and
identify flood prone areas before occurrence in Lagos State, a large coastal city
of Nigeria. The GIS-based flood risk methodology developed for a littoral
urban region proved to be helpful in extracting flood prone areas based on
elevation from the SRTM digital elevation model (DEM) and proximity to the source
of hazard. Such risk areas were then classified by magnitude of potential
risk, and five classes were identified: very high, high, moderate, low and very
low. This extracted flood mask was further used to estimate the proportion of
agricultural land and urban land likely to be affected in the event of a flood
episode. To support grassroots policy, the areal calculation was disaggregated
by local government area. Furthermore, land use/land cover data for
the study region were extracted from a Landsat image using a supervised
classification method based on the maximum likelihood algorithm. The extracted
potential flood risk masks were overlaid on the land cover data so as to assess
the likely impact of flood on the various land uses: agricultural land use and
urban land use. In the same way, the identified flood prone area masks were
entered into Google Earth Engine for the purpose of quick visual impression
and mapping of the flood vulnerable areas by neighborhood and road
infrastructure. The five-class vulnerability feature was then overlaid on the
local administrative map data loaded with the projected population figures of
the study region. From this the vulnerable population was estimated by Local
Government Area (LGA). The results of the study show that GIS is
a formidable tool for flood risk analysis, mitigation and pre-hazard planning.
This can be seen from the series of thematic maps that were generated and
used to develop a large GIS-assisted database. It is evident that the
database so generated will facilitate flood risk management and provide an
effective framework that will support policy formulation.

Chapter 10 - Detection of favorable zones for sustainable groundwater
supply in terrains underlain by crystalline rocks requires integrating two
approaches: Remote Sensing and GIS. Landsat ETM+ images are processed
using ERDAS IMAGINE 9.2 and ASTER images using ArcGIS 9.3.
Parameters controlling groundwater accumulation, such as rainfall,
lineaments, lithology, slopes, and drainage network, are evaluated in terms of 5
potential classes, namely very good, good, moderate, low and very
low potential, and integrated in the GIS tool. The resulting map shows that 5%
of the study area presents a very good groundwater potential and that this part is
mainly concentrated in the south of the study area and on the MTZ; very
low potentials constitute 13% of the area, located essentially in the north
of the study area and particularly on granites. Combining the results generated
by the GIS with NDVI and the borehole productivities shows that, generally,
good groundwater potentials are well correlated with high vegetation activity in
the dry season, except in areas affected by forest fires, which are frequent in the
region. High borehole productivities (11.5-30 m³/h) are observed in zones
presenting high groundwater potential according to the GIS tool, and very
low borehole productivities (0.6-2.5 m³/h) are found in the north part of the
study area, corresponding to the low and very low groundwater potentials.
Good, moderate and low groundwater potentials represent respectively 18, 33
and 30% of the total investigated surface area. This integrated approach,
combining the use of RS, GIS and hydrodynamic parameters such as borehole
productivities, can contribute to improving knowledge of groundwater resources
in the context of hard rock aquifers in south-eastern Senegal
and can reduce the high rates of failed wells. Thus, RS and GIS can be used as
efficient tools for assessing groundwater potential at large scale.
Chapter 1


APPLICATION OF VEHICLE ROUTING 
PROBLEM FOR SUSTAINABLE WASTE 
COLLECTION: CASE STUDY OF ALTOONA, 
PENNSYLVANIA 


Timothy J. Dolney* 
The Pennsylvania State University, Altoona College 
Ivyside Park, Altoona, Pennsylvania, US 


ABSTRACT 


The management of solid waste in the City of Altoona, Pennsylvania, 
USA is unique in that a department responsible for the design and 
collection of solid waste is non-existent. Further, the city does not 
contract any particular company for collection. Rather, the city utilizes a 
freedom to choose system where residents can choose from any one of 
twenty companies for their solid waste collection. The Intermunicipal 
Relations Committee (IRC) is the local organization responsible for 
overseeing and enforcing waste and recycling regulations within the city. 
The freedom to choose system is highly inefficient. The sheer number of 
companies operating within the city makes it difficult for the IRC to 
enforce regulations as each company’s customers are scattered 
throughout the city. On any particular day, several collection trucks could 
be driving through the same neighborhood. This results in lengthy
collection times and unnecessary miles traveled. This research utilizes the
GIS spatial analyst vehicle routing problem (VRP) function to model the 
current freedom to choose collection system and determine total 
collection times, miles traveled, and number of trips to the transfer 
station. Two alternative collection scenarios are proposed and modeled. 
Results indicate the inefficiencies associated with current collection when 
compared to two alternate scenarios. A controlled collection scenario 
reduces miles traveled by 70 percent and collection time by 44 percent. 
Greater savings of 76 and 50 percent occur with the improved efficiency 
scenario. Results confirm the wasteful miles traveled and man hours 
worked, thus demonstrating the need for city officials to implement 
changes that would bring savings to collection companies, customers, and 
the environment. 


Keywords: GIS; route optimization; vehicle routing problem; solid waste; 
refuse; sustainability 


INTRODUCTION 


The issue of sustainability has garnered a lot of attention in the past few 
years. This is primarily due to concerns about the unintended social, economic, 
and environmental consequences of rapid population growth, economic 
growth, and consumption of our natural resources (US EPA, 2012). 
Unsustainable solid waste disposal and collection is one area of concern that is 
pushing the environment towards potential risk. Solid waste collection is 
regularly performed by trucks with diesel engines that average 5.1 miles per 
gallon (mpg) and use 220 gallons of fuel per year during idling (Gaines et al., 
2006). The trucks release several emissions into the environment that are
proportional to both route time (including stops) and route distance. Routing
strategies are central to minimizing emissions. A poorly designed urban solid
waste collection system has an enormous impact on labor, operational and
transport costs, and on society in general due to road contamination and 
negative effects on public health and the environment (Arribas et al., 2010). 
Fortunately, geographic information system (GIS) technology exists to assist 
decision makers in improving their waste management strategies. 

GIS technology has been applied to many different areas of waste 
collection to assist in the design and execution of successful waste 
management systems. These include GIS to assist in the overall design of 
waste collection systems (Pandey et al., 2012; Kanchanabhan et al., 2011; 
Arribas et al., 2010; Zheng and Pan, 2010; Karadimas and Loumos, 2008;
Lopez et al., 2008; Salhofer et al., 2007; Sharholy et al., 2007; Ghose et al.,
2006) and siting of landfills (Yesilnacar et al., 2012; Nazari et al., 2012; 
Vasilijevic et al., 2012; Sumathi et al., 2008; Kontos et al., 2003; Leao et al., 
2001). The most notable application of GIS for waste collection is route 
optimization to improve routing strategies and reduce vehicle emissions. 

Jovicic et al. (2011) used ArcGIS network analyst functionality to
estimate the potential for reducing fuel consumption, and thus the emission of
carbon dioxide (CO2), through route optimization of communal vehicles.
Results indicated an approximate annual savings of 1,700 miles for one
collection vehicle within the City of Kragujevac, Serbia. Further, the most
fuel-economical route was extracted and compared with the original route, and
with the routes extracted using criteria concerning traffic time and
shortest distance. According to available information for the City of
Kragujevac and the results from this study, it was estimated that the total 
savings could be 20% in costs and the associated emissions. Bhambulkar 
(2010) also used ArcGIS network analyst to identify the best routing for municipal
solid waste that cannot be collected by standard waste collection trucks, due to 
size and other prohibitive obstacles in the municipality of Nagpur, India. 
Optimal routing was cost effective and less time consuming when compared 
with the existing route with a monthly savings of 14%. 

Tavares et al. (2009) used GIS 3D route modeling software for waste 
collection and transportation to optimize driving routes and minimize fuel 
consumption in the city of Praia, Cape Verde. Their model accounted for road 
inclination and vehicle weight. Using ArcGIS software, the most fuel 
economical route was calculated, yielding cost savings of 8-12% for fuel 
consumption even though the most economical route was 1.8% longer than the
shortest route. Karadimas et al. (2008) performed research using GIS to
optimize the number and position of the waste bins in the Municipality of 
Athens, Greece. The number of waste bins decreased from 162 to 112 (30% 
reduction). This reduction also presents great savings in energy used during
waste collection. Apaydin and Gonullu (2008) applied a shortest path model to 
Trabzon City, Turkey in order to optimize solid waste collection and minimize 
emissions. The optimized route decreased the route distance and route time by 
24.6% and 44.3%, respectively, for nine routes in 26 districts. Further, emissions
of carbon dioxide (CO2), nitrogen oxide (NOx), hydrocarbons (HC), carbon
monoxide (CO), and particulate matter (PM) decreased by 831.4, 12.8, 1.2, 0.4,
and 0.7 grams per route, respectively.
Karadimas et al. (2007) employed the ant colony system algorithm for
monitoring, simulating, testing, and optimizing costs for different scenarios of 
solid waste management systems. A GIS tool supported the municipal solid 
waste management system by using parameters such as waste bin locations, 
road network topology, and population density. Further research is required to 
test the proposed model on an extended area. Apaydin and Gonullu (2007) 
published results of their research in route optimization for solid waste 
collection in the city of Trabzon, Turkey. The city of Trabzon is as large as the 
city of Kragujevac, Serbia and has 185,000 inhabitants. For 39 districts in the 
city, a shortest path model was used to optimize solid waste collection
and hauling processes, with the aim of minimizing cost. The Route View Pro™
software was used as the optimization tool, and the improvement was
around 4-59% for distance and 14-65% for time. The total benefit was 24% in
total costs or about 18.014 $ monthly. Lakshumi et al. (2006) presented the
results of a study for the city of Chennai, India, which has a population of 4.5
million. The aim was to determine the optimal route for solid waste
collection and to compare the cost of the new optimized and present routes. The
commercial software package ArcGIS was used, with savings in length of
41.5% in the day shift and 44% in the night shift for one particular route.

One of the earliest applications of GIS for waste collection vehicle routing 
was performed by Chang et al. (1997). They applied a revised multi-objective 
programming model associated with GIS spatial analysis capabilities to analyze
and visualize optimal paths, and allocate vehicles and labor within a waste
collection network. They were able to analyze alternate solid waste collection
strategies under different planning scenarios in the network and quickly gain
a general understanding of the impact
of policy changes in the waste collection system. Among the authors’
conclusions was that GIS will increasingly be relied on to support solid waste
management issues in the future.

Other studies have relied on methodologies other than GIS
functionality. Kim et al. (2006) and Sahoo et al. (2005) each report on the
development of a waste collection vehicle routing problem with time windows
(VRPTW) algorithm. The VRP is typically utilized in the waste collection
industry to reduce the number of vehicles and total traveling time. The authors also
considered route compactness since a solution with better route compactness 
has fewer crossovers among routes. Their algorithm was successfully 
implemented and deployed for real life commercial waste collection problems 
at Waste Management (WM), Inc., the leading provider of comprehensive 
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waste management services in North America. Specifically, the algorithm was 
embedded in WasteRoute (Sahoo et. al, 2005), a comprehensive enterprise- 
wide web-based route management application that takes into account WM’s 
specific routing considerations. WasteRoute was deployed across North 
America beginning March 2003 and immediately brought savings. The authors 
illustrated one example of savings in the city of Elgin, Illinois, USA. Before 
WasteRoute, the city required ten 9-hour routes with a productivity of 57.06 
yards/hour. By utilizing Waste Route, the city now requires nine 9-hour routes 
and increased its productivity to 63.40 yards/hour. The authors estimate that 
WM could reduce 984 routes over the subsequent year for a savings of $18US 
million. In the end, WasteRoute reduced WM’s operational costs by 
organizing routes to minimize overlap and thereby reduce the number of 
vehicles WM needed to serve its customers, and by sequencing stops along a 
route to make the best use of fuel, driver schedules, and disposal trips. 

Ghiani et al. (2005) conducted a study of solid waste collection in the 
municipality of Castrovillari, Italy in an attempt to reduce the amount of 
money allocated to waste collection. The focus was on residential collection 
routes that were manually formed by a supervisor. They modeled waste
collection as an arc routing problem (ARP), which consists of designing a set
of vehicle routes traversing a specified subset of required arcs and/or edges
of a graph, with or without side constraints. The objective is to minimize
the total distance traveled by the vehicles. GIS was unavailable to the city
for the research, so the authors had to collect data about travel times on
the road network and estimate daily solid waste levels for each street. The
outcome was
a computerized system developed in Visual C language that eliminated 
overtime and achieved an annual reduction of approximately 8 percent in total 
cost. More detailed costs were generated that corresponded to three collection 
vehicles: a reduction in distance traveled equal to 10.6% (i.e. 18 km/day) 
equivalent to a savings of 7000 Euro per year; a reduction in work time equal 
to 13.5% (i.e. 2 h 10 s), overtime elimination, equivalent to an additional 
saving of 5000 Euro per year; and a better workload distribution among 
vehicles. These improvements were primarily the result of better allocation of 
containers to vehicles. The authors feel that if applied by other Italian 
municipalities, this kind of analysis could save hundreds of millions of Euro 
every year. 

Simonetto and Borenstein (2007) developed a decision support system 
(DSS) called SCOLDSS to support the operational planning of solid waste 
collection. The system specifically supports the following tasks: reducing the
amount of solid waste destined to the landfill; assuring a waste input
percentage at each sorting unit; assigning vehicles to collection trips;
defining their routes; and estimating the work capacity (productivity) of
sorting units in relation to waste arrival and processing (separation). The
system aids the operational management of solid waste collection through the
generation, analysis, and assessment of possible operational scenarios. The
system was developed using the Borland Delphi environment
and the commercial software Arena to carry out the simulations. Results from 
the system were validated using real data from the solid waste collection in the 
city of Porto Alegre, Brazil. By using the system, it is possible to obtain a 
mean reduction of 8.82% in the distance (43.8 km less) to be covered by the 
collection vehicles and a reduction of 17.89% in the weekly number of trips. 
The distances covered weekly would decrease 262.8 km, leading to an annual 
reduction estimated at 13,665 km. Concerning the number of trips, the current 
mean is 27.3 trips per day (163.8 per week). Using SCOLDSS, the average 
number of trips would be 134.9 weekly trips (reduction of 17.89%), which 
would result in an annual reduction of 1502 trips. These results outperform the 
current operation planning deployed. The authors feel the system has great 
potential as an effective support tool to be used in real world solid waste 
systems. 
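
The weekly and annual figures quoted above follow from the daily values if a six-day collection week is assumed. That assumption is inferred here from 163.8 weekly trips divided by 27.3 daily trips and is not stated in the text. A minimal sketch of the arithmetic, with illustrative variable names:

```python
# Figures quoted from Simonetto and Borenstein (2007). The six-day collection
# week is an assumption inferred from 163.8 weekly trips / 27.3 daily trips.
DAYS_PER_WEEK = 6
WEEKS_PER_YEAR = 52

daily_km_saved = 43.8          # mean daily reduction in distance
daily_trips_now = 27.3         # current mean number of trips per day
weekly_trips_scoldss = 134.9   # weekly trips reported with SCOLDSS

weekly_km_saved = daily_km_saved * DAYS_PER_WEEK
annual_km_saved = weekly_km_saved * WEEKS_PER_YEAR
weekly_trips_now = daily_trips_now * DAYS_PER_WEEK
annual_trips_saved = (weekly_trips_now - weekly_trips_scoldss) * WEEKS_PER_YEAR

print(f"weekly distance saved: {weekly_km_saved:.1f} km")
print(f"annual distance saved: {annual_km_saved:.1f} km")
print(f"current weekly trips:  {weekly_trips_now:.1f}")
print(f"annual trips saved:    {annual_trips_saved:.1f}")
```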

El-Hamouz (2008) demonstrates how a logistical management strategy can be
used to advantage for rescheduling municipal solid waste collection systems,
reallocating street solid waste containers, and minimizing vehicle routes.
The author used real field data to test the methods in Tubas, West Bank. The
application was tested by the private sector for 1 month. The new system
proved to be successful in terms of greater efficiency, coverage, and quality
of service. The total cost of collection was found to be US$21 per ton of
solid waste. This reduced collection costs to a level that is socially
acceptable (US$3.75/family/month) as well as economically and
environmentally sound.

The City of Altoona, Pennsylvania, USA and its solid waste collection 
system, or lack thereof, is a perfect candidate for a GIS-based solid waste 
collection strategy. The current collection system is highly inefficient with too 
many solid waste companies and too many miles traveled. This article presents 
the application of GIS for residential solid waste collection in the City of 
Altoona to demonstrate the system’s inefficiencies and also present alternate 
scenarios that improve the collection process. Information for this research
specific to solid waste collection in the city was obtained from several
sources, including the Executive Director of the Intermunicipal Relations
Committee (IRC), the Director of the Blair County Department of Solid Waste
and Recycling (defunct as of 10/2012), the Gannett Fleming, Inc. Technical
Assistance Study, solid waste collection companies, and City of Altoona
residents (including the author). The IRC is a council of governments (COG)
consisting of the City of Altoona and surrounding townships and boroughs. It
was initially established as the Intermunicipal Recycling Committee in 1990
to address the needs of member municipalities related to recycling and
composting required by Pennsylvania Act 101 of 1988. The name was changed in
1997 to reflect a desire by the member municipalities to undertake other
intermunicipal issues. It is currently the only organization that enforces
regulations related to solid waste collection in the city.

Figure 1. City of Altoona, Pennsylvania, USA.


2. STUDY AREA


2.1. Solid Waste Collection 


The City of Altoona, Pennsylvania is one of twenty-four municipalities 
located in Blair County (Figure 1). It has a population of 46,320 with 19,473 
households spread across 9.91 square miles of land (US Census Bureau, 2012). 
The city’s inefficient solid waste collection stems from a “freedom to
choose” system in which residents choose their own solid waste company,
because the city has no department of solid waste management and does not
contract a single company for waste collection. As a result, 20
independent companies operate within the city collecting solid waste and 
recyclables. Their customer base is not concentrated in one particular area of 
the city but rather is scattered throughout the city. As a result, any particular 
neighborhood could have several separate companies collecting solid waste 
throughout the day. Such a system has a trickledown effect as the sheer 
number of companies leads to a number of environmental and policy issues. 
The only positive aspect of solid waste collection within the city is that 
residents are required to participate in curbside recycling of newspaper, other 
mixed paper, cardboard, glass, plastic bottles, aluminum cans, and steel cans. 
Companies working within the city are required to properly recycle these 
materials and adhere to ordinances set by the city and IRC. It’s worth noting
that the only municipality in Blair County that contracts its solid waste
collection to a single company is the borough of Tyrone. Not coincidentally,
monthly costs in Tyrone are more than US$8 cheaper per household than the
average cost in the City of Altoona.


2.2. Inefficient Solid Waste Collection 


In 2006, Gannett Fleming, Inc. performed a study that evaluated recycling 
and solid waste collection in the City of Altoona and surrounding townships. 
They concluded that the freedom to choose system as it existed was expensive, 
inefficient, and impossible to enforce (Gannett Fleming, 2006). Based on the 
report, Altoona City Council was prepared to contract a single company for 
solid waste collection within the city. This would reduce the cost per
household to US$14.00 (25-35 percent savings) compared to the US$22.00
average monthly fee paid by Altoona residents for private subscription
service at that time. However, a small minority of Altoona residents voiced
opinions about their constitutional right to choose their own trash company.
Altoona City Council promptly folded, and residents are still free to choose,
with today’s costs generally US$22.00 or more per household. The
only positive outcome from this initiative was the division of the city into four 
separate areas, each with weekly uniform collection days (Figure 2). 
Collection days are named according to the main section of the city where 
collection occurs on that particular day. For instance, households that live in 
the Juniata section of Altoona have solid waste collection on Thursday. 
Subsequent figures presented in this article represent Thursday collection in 
the Juniata section of Altoona. Uniform collection days have improved the
ability to monitor curbside collection, something neither the IRC nor the
county office was able to do before because recycling could be set out on any
of twelve days over a two-week collection cycle.


[Figure 2 legend: Monday - Eldorado (3,599); Tuesday - Pleasant Valley
(5,371); Wednesday - Fairview/East End (6,201); Thursday - Juniata (4,786);
Transfer Station]

Figure 2. City of Altoona Solid Waste Collection Areas.

Collection companies also realize how inefficient the collection system is
but remain quiet because each benefits from holding a share of the customer
base, whether large or small. This was evident when attempts were made to
contact each of the 20 companies to gather general information about their
collection strategies. Questions included, but were not limited to: do you
use any route logistics when collecting solid waste, what is the size of your
solid waste hauler, and do you conduct 100 percent recycling? Many were
reluctant to answer questions, fearing that any information provided would
lead to their demise. A few did provide insight into their operations. All
company names were kept confidential for this research.

While this system allows residents freedom to choose their solid waste 
company, it testifies to an unawareness or denial of the high cost of the current 
system, both environmentally and financially. First, several collection
trucks frequenting the same neighborhoods create unnecessary additional
vehicle emissions. This problem is exacerbated by some companies using more
than one vehicle, one truck for solid waste collection and one for
recyclables. Thus, the number of collection trucks passing through a
neighborhood on collection days could potentially be greater than 20. The
process is even more inefficient when one considers that solid waste
collection companies generally lack routing strategies. Under the best
circumstances they are driving a few blocks between stops. The average
distance between stops in single-collector systems is measured in feet, but
is usually measured in blocks in a system like Altoona’s. Routes also include
trips to the transfer station once the trucks are filled. None of the
companies interviewed for this study has approached the issue scientifically
and systematically derived a route that minimizes mileage, intersection
crossings, and traffic congestion. It has been suggested that companies have
little incentive to invest time and money in route logistics. Rather, if an
inefficient route results in higher gas costs, it is easier for the company
to raise customers’ monthly bills.

Second, the increased truck traffic also generates noise pollution
throughout the day. Companies can begin collecting at 5AM and can continue
working until 6PM. The trucks themselves are loud, but the noise is magnified
when workers use the hydraulic system to compact the solid waste. Although
they can collect until 6PM, most try to finish before the transfer stations
close at 3PM. Those that collect later in the day have no place to take their
solid waste, and it remains in trucks overnight, sometimes parked in areas
where this is not permitted. Because companies begin collecting at different
times and locations, trucks continually pass through neighborhoods during the
day. Increased truck traffic can also lead to road degradation. Last, City of
Altoona residents are paying more for the freedom to choose system than they
would if the city had one or two contracted companies.

Aside from the freedom to choose, two other benefits have been identified 
with this system. First, elderly residents have difficulty preparing their trash 
for curbside collection and the company they subscribed with for 25-plus years 
understands their need. Upon arriving at their residence, workers will walk to 
the back door to retrieve their trash and place it in the hauler. Some companies 
even advertise on their collection trucks the phrase “elderly friendly.” Many 
elderly fear they will lose this privilege if the city contracts a single
required company for all residential collection. However, the few companies
interviewed for this research indicated that, if the city were to do so, they
would be willing to assist elderly or handicapped customers. Special
contractual provisions (for a small additional charge) could also be added
for those who need or desire backdoor pickup. Second, most companies do not
have restrictions on the
amount of solid waste residents can place out for pick-up. Many residents said 
they can place large amounts of garbage out for pick-up (seven or eight bags) 
and their company will collect it all. Companies do not want to place 
restrictions on amounts for fear they will lose business to other companies. 
Aside from the cost, residents are the primary beneficiary of the current 
system. Each company also benefits by getting a share of the customer base. 
Regardless, there are too many companies participating in solid waste 
collection and the benefits residents receive negatively impact the 
environment. The situation also demonstrates how local governments fail to 
use GIS as a planning tool. 

This research presents the use of GIS as an assessment and planning tool 
for a case study of solid waste collection in the City of Altoona. The city lends 
itself as a unique study area compared to others in the academic literature due 
to the sheer number of collection companies and the lack of systematic 
planning amongst them. Jovicic et al. (2011) state that there is no universal
solution for the optimization of solid waste management, as each locale’s
characteristics must be taken into consideration as unique. Utilizing the
Vehicle Routing Problem (VRP) function of the GIS network analyst and data
obtained from the IRC, the purpose of this research is two-fold. First, GIS
was used to model the current freedom to choose system to assess collection
time, miles traveled, and number of trips to the transfer station. The design
of the current collection system likely results in wasted miles traveled and
lengthy collection times. Therefore, the second part of this research used
GIS to model alternate collection scenarios that potentially improve the
freedom to choose system and provide greater efficiency. Rather than adopt
the approach taken by other studies and use VRP only to solve for the optimal
routing strategy, this research proposes and models an alternate scenario
that utilizes several collection companies but remains efficient.


3. DATA AND METHODS 


This research design utilized the Vehicle Routing Problem (VRP) function
of ESRI’s ArcGIS Network Analyst. The VRP is a type of network analysis tool
for routing a fleet of vehicles to service a set of orders with the goal of
minimizing some objective (e.g., operating cost) while satisfying certain
constraints. These constraints may include time windows, multiple route
capacities, travel duration constraints, route zone and route seed point
constraints, specialties constraints, and paired order constraints (ESRI GIS
Dictionary -
http://support.esri.com/en/knowledgebase/GISDictionary/term/vehicle%20routing%20problem).
Several studies have utilized VRP within GIS for the purpose of assessing and
designing improved waste collection strategies (Jovicic et al., 2011;
Bhambulkar, 2010; Kim et al., 2006; Sahoo et al., 2005). Further, ArcGIS is
readily available and is the most popular GIS software package used within
many state and local governments in the United States, so local governments
could apply this research design to their own solid waste collection
strategies. The VRP interface within ArcGIS is user-friendly and allows users
to define a variety of inputs. Given these reasons and the availability of
data from the IRC, VRP within ArcGIS was chosen as the method to analyze
waste collection in the City of Altoona.

For this research, the solid waste collection companies represent the fleet 
of vehicles serving a set of orders comprised of residential households and 
their solid waste. Constraints include time windows for companies to begin 
and end collection, amount of solid waste per household, household collection 
time, capacity of solid waste collection trucks, and typical road constraints 
(speed limits, turn restrictions, traffic congestion) imposed by the road 
network. The VRP provides output that includes driving directions, total 
distance traveled, and total travel time. 
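
To make the roles of orders, capacity, and the transfer station concrete, the following is a minimal sketch of a greedy nearest-neighbour route for a single truck. It is not the solver used by ArcGIS Network Analyst and does not find optimal routes. Straight-line distances on planar coordinates stand in for the street network, and the function name, the parameter names, and the 20 mph travel speed are assumptions made only for illustration.

```python
import math

def route_truck(depot, transfer, stops, capacity_lbs, lbs_per_stop=50.0,
                speed_mph=20.0, service_min=0.5, unload_min=10.0):
    """Greedy nearest-neighbour route for one truck (illustrative only).

    Coordinates are planar (x, y) in miles; straight-line distance stands in
    for the street network. Returns (visit order, miles travelled, minutes
    elapsed, trips to the transfer station).
    """
    remaining = list(stops)
    position, load = depot, 0.0
    miles, minutes, trips = 0.0, 0.0, 0
    order = []

    def drive(to):
        # Move the truck and accumulate distance and driving time.
        nonlocal position, miles, minutes
        d = math.dist(position, to)
        miles += d
        minutes += d / speed_mph * 60.0
        position = to

    while remaining:
        if load + lbs_per_stop > capacity_lbs:
            # Truck is full: unload at the transfer station first.
            drive(transfer)
            minutes += unload_min
            trips += 1
            load = 0.0
        nxt = min(remaining, key=lambda s: math.dist(position, s))
        remaining.remove(nxt)
        drive(nxt)
        minutes += service_min
        load += lbs_per_stop
        order.append(nxt)

    # The transfer station is always the last stop, full truck or not.
    drive(transfer)
    minutes += unload_min
    trips += 1
    return order, miles, minutes, trips
```

A real street network adds one-way streets, turn restrictions, and time-dependent speeds, which is why a network-based solver is needed for results that can be compared with actual operations.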

Applying the VRP to solid waste collection in the City of Altoona
required several data and methodology considerations:

• residential locations - represented through GIS polygon parcel data
  obtained from the Blair County Assessment Office. Structured query
  statements were used to select residential land use types to serve as
  household collection points.

• solid waste collection companies - the IRC provided an address list of
  the 20 companies and their locations were geocoded in ArcMap.

• transfer stations - the City of Altoona utilizes two transfer stations
  that are within 0.5 mile of each other. Addresses of each were
  obtained and geocoded using ArcMap. Their geographic center was
  calculated to serve as the city’s lone transfer station for this research.
  The transfer station illustrated in Figure 2 is the geographic center of
  the two primary transfer stations.

• streets - in order to model the street network as close to reality as
  possible, StreetMap Premium for ArcGIS was obtained. It is an enhanced
  street dataset based on commercial street data from NAVTEQ and TomTom
  that works with ESRI’s ArcGIS software to provide geocoding,
  routing, and high-quality display. The data allows one to generate the
  shortest or fastest point-to-point or multistop routes with
  driving directions. One-way and turn restriction information is taken
  into consideration to ensure the most accurate routes and directions.
  The data also includes historical traffic data that summarizes the
  average roadway travel speed for more accurate arrival time
  projections and avoidance of congestion based on day and time.

• amount of solid waste - the IRC provided the 2011 solid waste
  collection totals for each of the 20 companies operating within the city
  (Table 1). Based on these totals, the average amount of solid waste
  generated per household each week in the City of Altoona is 50.2 lbs.
  This number was rounded to 50.0 lbs per household.

• collection time - this research accounted for total collection time
  between 5AM and 3PM. Variables include street travel time, time
  collecting and loading each household’s solid waste, and time at the
  transfer station. Collection time begins at 5AM when the collection
  truck leaves company headquarters, not when it arrives at the first
  household. This is an unfortunate input in the VRP model that cannot
  be altered. Collection time ends at 3PM with the transfer station the
  last stop for each company regardless of whether the collection truck is
  full. This ensures the truck does not sit overnight with solid waste
  onboard. The Gannett Fleming, Inc. Technical Assistance Study
  (2006) found the average household stop within the city is 30 seconds.
  This is the same amount of time used by Arribas et al. (2010).
  (Anecdotal evidence would indicate that this collection time is
  actually much longer with some smaller companies due to trucks that
  do not compact waste efficiently or a single employee doing both
  driving and collecting of solid waste.) According to workers at the
  local transfer stations, the amount of time spent unloading a full truck
  varies between 5 and 15 minutes depending on the number of other trucks
  present. This research uses an average time of 10 minutes for each
  visit to the transfer station.

• solid waste collection truck size - the size of each company’s solid
  waste collection truck varies from the smallest of a 2 cubic yard loader
  to the largest of 25 cubic yards. Varying sizes of 11, 12, 14, and 20
  cubic yard loaders also exist. Because so few companies were willing
  to cooperate when contacted, the size of each hauler was visually
  estimated and placed into one of three size categories:

  ✓ Mini - 4 cubic yard loader = 1.0 ton (2,000 lbs) of solid waste
  ✓ Mid - 12 cubic yard loader = 4.5 tons (9,000 lbs) of solid waste
  ✓ Large - 25 cubic yard loader = 10.5 tons (21,000 lbs) of solid waste


Based on solid waste hauler size and amount of solid waste per household, 
the approximate number of households each can collect prior to capacity is 40 
houses for a mini-hauler, 180 houses for a mid-hauler, and 420 houses for a 
large hauler. 
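
The household counts above are each hauler's weight capacity divided by the 50.0 lbs assumed per household. A minimal sketch of that calculation follows. The function and constant names are illustrative, and the trip count is only a lower bound that ignores the 10-hour collection window.

```python
import math

LBS_PER_HOUSEHOLD = 50.0   # rounded weekly average used in this research
HAULER_CAPACITY_LBS = {"mini": 2_000, "mid": 9_000, "large": 21_000}

def households_per_load(size):
    """Households a hauler of the given size can serve before it is full."""
    return int(HAULER_CAPACITY_LBS[size] // LBS_PER_HOUSEHOLD)

def transfer_trips(size, households):
    """Lower bound on transfer-station visits needed to serve the households.

    At least one visit is always counted because every route ends at the
    transfer station.
    """
    return max(1, math.ceil(households / households_per_load(size)))

for size in HAULER_CAPACITY_LBS:
    print(size, households_per_load(size))
```

For example, `transfer_trips("mini", 175)` gives the fewest visits a mini hauler could make while serving 175 households.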


3.1. Vehicle Routing Problem 


These data and methods were used in conjunction with the VRP to assess 
the current collection system and also propose more efficient collection 
scenarios. Figure 3 illustrates the layout of the VRP model as it relates to
solid waste collection within the city. Even though some companies utilize 
more than one truck to collect solid waste, this research models one collection 
truck per company on each collection day. Three separate collection scenarios 
for the City of Altoona were modeled: current collection, controlled collection, 
and improved efficiency. The current collection scenario models all 20 
independent companies as they collect solid waste on each of the four 
collection days.

Table 1. Solid Waste Company Collection Statistics for Year 2011

 1   Company A    4,841    24.3%     601.52    8,064.43    8,665.94
 2   Company B      396     2.0%      15.32      727.71      743.03
 3   Company C      440     2.2%       0.00      884.83      884.83
 4   Company D    2,794    14.0%      98.83    1,140.82    1,239.65
 5   Company E    1,100     5.5%      23.01      742.85      765.86
 6   Company F       44     0.2%       0.12        3.25        3.37
 7   Company G       66     0.3%       0.17        9.95       10.12
 8   Company H      352     1.8%      24.00    3,495.67    3,519.67
 9   Company I      528     2.6%       3.25      378.34      381.58
10   Company J      352     1.8%      46.04    2,055.44    2,101.48
11   Company K      968     4.9%       0.00      715.44      715.44
12   Company L      352     1.8%       0.00      634.51      634.51
13   Company M      968     4.9%      24.01      749.36      773.38
14   Company N    2,310    11.6%      51.48    1,976.20    2,027.68
15   Company O      176     0.9%       4.00       38.10       42.10
16   Company P      440     2.2%       3.01      363.85      366.85
17   Company Q    1,496     7.5%      70.53      503.85      574.37
18   Company R      792     4.0%      10.00      285.44      295.44
19   Company S    1,320     6.6%      89.58    2,077.62    2,167.20
20   Company T      220     1.1%       6.50      170.57      177.07
     TOTAL       19,957   100.0%   1,071.36   25,018.22   26,089.58


Even though each company’s customer list is not readily available, the
number of customers each serves within the city is recorded by the IRC
(Table 1). To allocate the correct number of customers to each solid waste
collector, a “select random polygons” tool for use in ArcGIS’s ArcToolbox was
obtained from the United States Department of Agriculture (USDA) in North
Carolina, which developed the tool for use in soil survey updates and
evaluations. Given a polygon data-set, the user can specify the number of
random polygons to be selected.

Figure 3. Layout of VRP Model and Inputs for Modeling Solid Waste Collection.

The tool was applied to the parcel data to assign residential locations 
(customers) on each of the four collection days to the 20 companies (Figure 4). 
The randomness of the tool mimics reality in that there is no systematic 
structure in residents choosing their solid waste collection company. All 
residential locations in each of the four collection areas were assigned to a 
particular company. No residential location is served by more than one 
company. 
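
The random allocation described above can be sketched as follows. This is a hypothetical stand-in for the USDA “select random polygons” tool, whose internals are not described here. The function name, the `quota_by_company` argument, and the use of a seeded shuffle are assumptions made for illustration.

```python
import random

def assign_parcels(parcel_ids, quota_by_company, seed=0):
    """Randomly give each company its quota of parcels.

    Every parcel is assigned to at most one company, mirroring the rule that
    no residential location is served by more than one company.
    """
    if sum(quota_by_company.values()) > len(parcel_ids):
        raise ValueError("quotas exceed the number of parcels")
    rng = random.Random(seed)      # seeded so a run can be repeated
    shuffled = list(parcel_ids)
    rng.shuffle(shuffled)
    assignment, start = {}, 0
    for company, quota in quota_by_company.items():
        # Take the next `quota` parcels from the shuffled list.
        for pid in shuffled[start:start + quota]:
            assignment[pid] = company
        start += quota
    return assignment
```

Because the shuffle ignores location, each company's customers end up scattered across the collection area, which is the property the current collection scenario needs to reproduce.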

The controlled collection scenario models six solid waste companies in 
each of the four collection areas. Residential locations serving as solid waste 
collection points are equally divided into six geographic clusters (Figure 5). 
The purpose of this scenario is to give local companies with the larger number 
of customers an equal share of customers. One of the concerns city officials 
have about contracting solid waste collection to only one or two companies is 
that approximately 18 companies would lose their customer base, thus hurting 
local businesses. As a compromise, this scenario utilizes six companies with 
residential locations clustered rather than spread across the city.
Identifying clusters in GIS can be performed using the Cluster and Outlier
Analysis or Hot Spot Analysis tools given a weighted variable for the data.

Figure 4. Distribution of Customers for Thursday (Juniata) Collection -
Scenario 1.

However, this research is unable to use these tools for cluster
identification, as the lone weighted variable that can be used, amount of
waste per household, is the same for all households (50.0 lbs per household).
As a result, clusters were visually determined. Based on the current data
available, an alternate method of identifying clusters could be based on the
amount of waste to be collected and truck size: large haulers would get a
greater number of clustered households than mid-sized haulers. In the end,
the purpose of this scenario is to demonstrate how clustering a company’s
customer base can potentially reduce miles travelled and collection time.
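
The clusters in this research were drawn visually. Purely as an illustration of how equal-sized geographic groups could be produced automatically, the sketch below sorts household points by their angle around the centroid and cuts the sorted list into k near-equal sectors. The function name and the choice of an angular sweep are assumptions, not the method used here.

```python
import math

def sweep_clusters(points, k=6):
    """Split points into k near-equal groups by angle around their centroid.

    Returns a list of cluster labels (0 to k-1), one per input point. Group
    sizes differ by at most one.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Rank the point indices by their angle around the centroid.
    ranked = sorted(
        range(n),
        key=lambda i: math.atan2(points[i][1] - cy, points[i][0] - cx),
    )
    labels = [0] * n
    for rank, i in enumerate(ranked):
        labels[i] = rank * k // n
    return labels
```

Sectors produced this way are contiguous in direction from the centroid but not necessarily compact, so an analyst would still want to review them against the street network.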

The improved efficiency scenario determines the maximum number of 25
cubic yard solid waste collection trucks needed to service each of the four
collection days. A 25 cubic yard truck was selected to maximize the amount of
solid waste that can be collected, which ultimately reduces collection time
and distance traveled. Rather than contract solid waste collection to one or
several companies, officials want to explore the option of the city
performing collection with its own trucks and city workers. This scenario
would provide officials with an approximation of how many trucks are needed
to service the city and the amount of time needed for collection. This
scenario was modeled with VRP using one 25 cubic yard truck serving all
residential locations on a particular collection day in the 10-hour
collection time frame. Once the time frame limit was reached, the number of
residential locations serviced was determined and then removed from the
data-set.

The model was then run again for the remaining residential locations on
that particular collection day. This process was repeated until all locations
for each collection day were serviced, thus providing the total number of 25
cubic yard trucks needed to service each collection day in the required time
frame. Output from the VRP model for all scenarios includes: driving
directions, time collection begins and ends, total collection time
(hours:minutes), total distance traveled (miles), and number of trips to the
transfer station.
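
The run, remove, and rerun procedure described above can be sketched as a simple loop. Here `solve_one_truck` is a hypothetical placeholder for a single-truck VRP run, and `toy_solver` is a deliberately crude stand-in that assumes a fixed one minute per stop. Neither reflects ArcGIS behaviour or the values produced in this research.

```python
def count_trucks(stops, solve_one_truck):
    """Repeat single-truck runs until every stop has been served.

    `solve_one_truck(stops)` must return the list of stops that one truck can
    serve inside the collection window.
    """
    remaining = list(stops)
    trucks = 0
    while remaining:
        served = solve_one_truck(remaining)
        if not served:
            raise RuntimeError("solver served no stops; window too short?")
        served_set = set(served)
        # Remove the serviced locations and run again on what is left.
        remaining = [s for s in remaining if s not in served_set]
        trucks += 1
    return trucks


def toy_solver(stops, window_min=600.0, minutes_per_stop=1.0,
               stops_per_load=420, unload_min=10.0):
    """Hypothetical stand-in for a VRP run: serve stops in list order."""
    elapsed, served = 0.0, []
    for n, stop in enumerate(stops, start=1):
        # Every full load adds one unloading visit to the transfer station.
        cost = minutes_per_stop
        if n % stops_per_load == 0:
            cost += unload_min
        # Always keep time for the final unload at the transfer station.
        if elapsed + cost + unload_min > window_min:
            break
        elapsed += cost
        served.append(stop)
    return served
```

Calling `count_trucks(list_of_stops, toy_solver)` returns the number of passes needed, which corresponds to the number of trucks in this scenario.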

Driving directions could greatly benefit solid waste collection companies
by providing drivers with an optimal route serving residential locations that
could reduce miles traveled and fuel consumption.

Figure 5. Distribution of Customers for Thursday (Juniata) Collection -
Scenario 2.


4. RESULTS


The City of Altoona has 19,957 residential locations that generate 
approximately 997,850 pounds of solid waste each week. Monday collection 
in Eldorado contains the fewest households with 3,599. Conversely, 
Wednesday collection in Fairview contains the greatest number with 6,201. 
This represents a 72 percent increase in households compared to Eldorado. 
Several companies with a large number of Fairview residential customers 
expressed frustration regarding the difficulty in completing all their stops prior 
to the 3PM deadline. This demonstrates the need for a more efficient 
collection system as driving several blocks between households and 
crisscrossing the city results in wasted work time. 

City-wide, Company A serves the largest number of households at 24.3
percent. Three companies (A, D, and N) serve half of all households. The 
remaining 50 percent is comprised of several companies with only 1 to 2 
percent of households and three companies (F, G, and O) serving less than one 
percent. This exemplifies the large amount of unnecessary waste collection 
traffic as several solid waste companies have few customers that could 
otherwise be served by companies with a larger number of customers. 


4.1. Scenario 1 


Table 2 presents results from modeling Scenario 1 (current collection) in
the four collection areas. Monday collection has the greatest number of
miles traveled (1,266.8 miles) of all four collection days. Conversely,
Monday has the fewest collection hours (64 hours and 46 minutes). The
transfer station being located furthest from the Monday collection area
influences the former, while the small number of households influences the
latter. Wednesday collection in Fairview neighborhoods has
the greatest number of households resulting in the longest collection time of 74 
hours and 50 minutes. This represents a 15 percent increase in collection time 
from the least on Monday. The greatest number of trips to the transfer station 
(52) also occurs on Wednesday. With the transfer station nearby, the fewest 
miles travelled (820.7 miles) occurs during Thursday collection. This 
represents a 35 percent decrease from the longest distance on Monday. 
Overall, results suggest that distance traveled increases with 
distance from the transfer station and that hours of operation 
increase with the number of households. 


Table 2. Scenario 1 — Current Collection — Solid Waste Collection Statistics 


CMPY | SIZE | HHs | HR | MIN | DIST | TRNS | HHs | HR | MIN | DIST | TRNS | HHs | HR | MIN | DIST | TRNS | HHs | HR | MIN | DIST | TRNS 
A targe] 873 | a | 59| 127. }1,504|12| 25 | ozs| a [1,261] 9 | 10 | 672| 3| 
B Mid 71 ` 123| 1 | 44 25.2 24.0 1 
C Mid 79 1 | 40 20.9 18.2 1 
D Mid 8 76.9 4 
E Mid 3 38.8 3 
F Mini 0 10.0 1 
G Mini 0 11.0 1 
H Large 30.5 1 
I Large 94.9 $ 1 
K Mini 29 59.3 6 
L Mid 1 
M Mini 6 
N Mid | 417 | 6 5 
o Mid 32 5 1 
P Mid 79 29 3 
Q Mid | 270 | 4 | 16 73.7 x 50 3 
R Mid 3 5 55 54 2 
S Large} 238 | 2 | 49 40.9 3 1. 317) 3 9 33 1 
r |mni 
TOTAL 72 74 
Total Time: 282 Hours and 32 Minutes Total Distance: 4,017.3 miles Total Trips to Transfer Station: 188 


NOTE: CMPY: Company, HHs = Number of Households, HR = Hours, MIN = Minutes, DIST = Distance Travelled, TRNS = Trips to 
Transfer Station, S1 = Scenario 1. 
With the greatest number of customers, Company A has the longest 
collection times on each of the four days. One advantage is its large hauler, 
which reduces trips to the transfer station. On two occasions (Tuesday and 
Wednesday), it exceeds the 10-hour collection time and would need a 
minimum of two trucks to complete its route in the allotted time. With the 
exception of Tuesday and Wednesday, Company A does not travel the longest 
distances. Company K, with only 175 households, covers the longest distance 
of 135.6 miles on Monday. 

The combination of a mini hauler and a lengthy distance from the transfer station 
results not only in a long total travel distance but also in a collection time of 
approximately 6 hours and 24 minutes for 175 households. Company D covers 
the longest distance of 89.6 miles on Thursday. This is a result of the large 
number of households it serves and its mid-size hauler. Other companies 
require less time to complete their routes: eight to nine companies 
finish within 2 hours, and Companies F and G finish in less than an hour. Multiple 
companies make one trip to the transfer station with a truck that never 
reaches capacity. 

When interpreting results, Company I is an anomaly because its 
headquarters are approximately 65 miles (1 hour 20 minutes) from the City of 
Altoona. This time and distance are part of its total solid waste collection time 
and travel distance. One wonders why a company with fewer than three percent 
of customers travels such a distance to collect solid waste. The 
freedom-to-choose system allows virtually anyone to collect in the city provided they follow a 
few regulations. 


4.2. Scenario 2 


Results for Scenario 2 follow the same pattern as current collection, with the 
greatest number of miles traveled on Monday and the longest collection time on 
Wednesday (Table 3). 

However, the design of controlled collection has greatly reduced miles 
traveled and collection time for all four collection days. The combined total 
distance traveled on all four collection days decreased from 4,017.3 miles to 
1,173.6 miles, a 70.8 percent decrease. Thursday collection in Juniata has the 
largest decrease in miles traveled with a 73.8 percent reduction (820.7 miles to 
215.1 miles). The combined total collection time for all four collection days 
decreased 44.8 percent, from 282 hours and 32 minutes to 156 hours and 2 
minutes. 
Monday and Thursday collection each reduced collection time by 50 
percent, with Wednesday collection having the smallest reduction at 38 
percent. The controlled collection scenario would reduce the overall number of 
trips to the transfer station across all four collection days by 52 percent (188 to 
90). 
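
The percent reductions reported for Scenario 2 can be reproduced from the Scenario 1 and Scenario 2 totals. The following is a minimal Python sketch of that arithmetic; the helper names `to_minutes` and `percent_reduction` are illustrative and are not part of the original analysis.

```python
def to_minutes(hours, minutes):
    """Convert a collection time given as hours and minutes to total minutes."""
    return hours * 60 + minutes


def percent_reduction(before, after):
    """Percent decrease from `before` to `after`, rounded to one decimal place."""
    return round((before - after) / before * 100, 1)


# Totals reported for Scenario 1 (current) and Scenario 2 (controlled collection).
time_s1, time_s2 = to_minutes(282, 32), to_minutes(156, 2)
dist_s1, dist_s2 = 4017.3, 1173.6   # miles
trips_s1, trips_s2 = 188, 90        # trips to the transfer station

print(percent_reduction(time_s1, time_s2))    # 44.8
print(percent_reduction(dist_s1, dist_s2))    # 70.8
print(percent_reduction(trips_s1, trips_s2))  # 52.1
```

Working in minutes avoids mixing hours and minutes when comparing collection times.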


4.3. Scenario 3 


Results from the improved efficiency scenario illustrate the maximum 
number of 25-yard collection trucks required for each of the four collection 
days. The number of trucks ranges from a low of three during Monday 
collection to a high of five on Wednesday (Table 4). 

This is a substantial reduction compared to the current 20 solid waste 
collection trucks of varying sizes. Compared to Scenario 1, the combined 
number of miles traveled decreases by 76.7 percent (4,017.3 miles to 973.3 
miles). Each of the four collection days reduced its miles traveled by at least 
70 percent, with Monday collection in Eldorado showing the largest reduction at 80.5 
percent. Similarly, the combined total collection time was reduced by 49.6 
percent (282 hours and 32 minutes to 142 hours and 20 minutes). 

Monday collection, with the fewest households, has the largest 
decrease in collection time (59.0%), while Wednesday collection, with the 
greatest number of households, has the smallest decrease (41.9%). The combined 
number of trips to the transfer station decreased from 188 to 56, a 70.2 percent 
reduction. The single largest decrease occurs on Thursday collection, where the 
number of trips to the transfer station decreases from 47 to 12 (75.4 percent 
decrease). 

The improved efficiency scenario also improves upon Scenario 2. The combined number of 
miles traveled is 20.1 percent less (1,173.6 to 973.3 miles). The percent 
reduction ranges from 6.6 percent (Wednesday) to 30.0 percent (Monday). The 
former contains the greatest number of households and the latter the fewest. 

There is a 12.6 percent reduction in collection time (156 hours and 2 
minutes to 136 hours and 24 minutes). The largest reduction occurs during Thursday 
collection in Juniata (20.4 percent) and the smallest during Wednesday collection 
in Fairview (5.5 percent). Scenario 3 requires 38.9 percent fewer trips to the 
transfer station than Scenario 2 (90 to 55). 


Table 3. Scenario 2 — Controlled Collection — Solid Waste Collection Statistics 


             | Monday - Eldorado | Tuesday - Pleasant Valley | Wednesday - Fairview | Thursday - Juniata 
CMPY | SIZE | HHs | HR | MIN | DIST | TRNS | HHs | HR | MIN | DIST | TRNS | HHs | HR | MIN | DIST | TRNS | HHs | HR | MIN | DIST | TRNS 
| a | tare | 600 | a | 37 | 343] 2 | a96| 6/44] asal 3 [rosaa] s| mal 3 | 78/5/14] 250] 2 | 
pe | wid {seo | 6 | o | sar 4 bass | 7 sé} sists luovsl 7 sa} seal _¢ | roel ae) ara} 5 | 

Mia | 600/ 6 | 1 | s| 4 | a95|7| 31| seol s [Loa s| 2 | ssal e | 797 | 6| 36 | 472] 5 | 
j soo | 4 |as | ara| 2 | a95 | 6 | 25| 39.7] 3 |1034|7| 19| ses) 3 | 798| 5 | 28| si| 2 | 


one fen 


Mid 


OAE TEN OUE T AEE EE 


targe | s00 a |33| 324] 2 |es|elas| aral 3 rose) 7/7] 502] 3 | eis] e| 2s 2 
oa foso are loer Yaar a l aea] oe [eon || aral ar [azae s as Tasa a 
|% Reductionsa | | 50.7% |7a2%|ss.o%| | a07% | ess%|s1o%| | 38.6% | 688%|aga%] | 50.1% |73.8% 55.3% 


TOTAL S2 Total Time: 156 Hours and 2 Minutes Total Distance: 1,173.6 miles Total Trips to Transfer Station: 90 
% Reduction S1 Total Time: 44.8% Less Total Distance: 70.8% Less Total Trips to Transfer Station: 52.1% Less 


NOTE: S2 = Scenario 2. 


Table 4. Scenario 3 — Improved Efficiency — Solid Waste Collection Statistics 


             | Monday - Eldorado | Tuesday - Pleasant Valley | Wednesday - Fairview | Thursday - Juniata 
CMPY | SIZE | HHs | HR | MIN | DIST | TRNS | HHs | HR | MIN | DIST | TRNS | HHs | HR | MIN | DIST | TRNS | HHs | HR | MIN | DIST | TRNS 
[a | tage [2570 [20] o | sil a fasli] o | sorl a [aaoi] 0 | e27) « fislo] ol sae] a | 
fee ee ee ee ee 

wee | es [e [o| seal s fra |n[ 0 esa a fasono] 0 | eal a fasso | 0 | sf 


Large 1,144 50.0 1,430 | 10 64.0 44.2| 3 
Large ---- ---- | ---- ---- 481 | 3 | 27 26.1| 2 ---- [nnn] ---- | ---- ---- 


| Toi | 3,599] 26 | 32 | 2470| 11 | 5.371] 38/14 | 239.5) 15 | 6,201] 43] 27 | 2780 18 | 4,786) 3a| 16 | 2087| 12 | 
a a ee eo 
| %Reductions2 | | 168% |30.0%|s80%{ | 10.5% |223% 37.5%] | ss% | 66%jasan] | 20.4% | 19.6%] 47.6% 


TOTAL S3 Total Time: 136 Hours and 24 Minutes Total Distance: 973.3 miles Total Trips to Transfer Station: 55 
% Reduction S1 Total Time: 51.7% Less Total Distance: 76.7% Less Total Trips to Transfer Station: 70.7% Less 
% Reduction S2 Total Time: 12.6% Less Total Distance: 20.1% Less Total Trips to Transfer Station: 38.9% Less 
5. DISCUSSION 


Modeling the current collection system and two alternate 
scenarios confirms the initial hypothesis that the current collection 
system is highly inefficient and can be improved. This is most evident in 
the long collection time, lengthy distance traveled, and substantial number of 
trips to the transfer station on each collection day. The obvious problem with 
the current collection system is the number of solid waste companies and the 
varying number of customers each serves. Compounding the problem is the size 
of the collection truck used by several companies, particularly those with a 
mini hauler that serve too few or too many households. For instance, 
Companies F and G serve such a low percentage of households that each 
finishes collection in less than an hour on each day. They account for 
unnecessary miles traveled and trips to the transfer station, and their customers could otherwise 
be served by larger companies. Conversely, Companies K and M, each with a mini 
hauler, serve a larger percentage of customers. This requires several trips to the 
transfer station, translating to a much longer collection time and distance 
traveled. For comparison purposes, Company E, with only 23 more households 
than Company K, requires approximately half the time, distance, and trips to 
transfer station because it utilizes a mid-size hauler. However, the same 
argument can be made against Companies H and I, which have large haulers but 
serve only a few households. Regardless of how one analyzes current 
collection, the system is highly inefficient in terms of the number of solid waste 
collectors, truck sizes, number of miles traveled, collection time, and the 
number of trips to the transfer station. Fortunately, this research proposed and 
demonstrated alternate scenarios that greatly improve upon these variables 
regardless of which scenario is chosen. 

Scenario 2 (controlled collection) improves on current collection by at least 50 
percent in all variables with the exception of total collection time. This 
reduction cannot be attributed solely to decreasing the number of solid waste 
collectors to six; it also reflects the geographic 
clustering of their households. With each collector serving a particular cluster, the amount of 
driving across the entire city was reduced. The only driving outside each 
cluster involved trips to the transfer station. This translated to a 70 percent 
reduction in the combined total distance traveled across the four collection 
days. The elimination of mini-size haulers also reduces miles traveled. The 
elimination of several companies and geographic clustering of customers 
presented in Scenario 2 is similar to the approach instituted by the City of Portland, 
Oregon, USA twenty years ago. Solid waste, recycling, and composting 
collection services are provided to residents under a franchise system that 
limits the number of companies authorized to provide service. The city assigns 
each company a section of the city and also regulates the rates companies are 
allowed to charge, determined through a comprehensive rate study (Hong, 
Adams, and Love 1993; Hong and Adams 1999; City of Portland 2012). This 
structure has allowed the City of Portland to promote high quality solid waste, 
recycling, and composting collection services while simultaneously 
maximizing recycling participation and recovery. 

A drawback of implementing Scenario 2 is that 14 independent solid waste 
companies would lose their customer base in the City of Altoona, which 
could impact the local economy by possibly putting small collection 
companies out of business. However, a reduction in the number of companies 
could potentially save households approximately US$7.50 per month. This 
translates to a savings of approximately US$1.75 million each year (19,473 households 
* 12 months * US$7.50). Households could invest these savings back into the local 
economy. With Scenario 2, one must weigh the elimination of collection 
companies against reduced collection times and distances, which ultimately 
reduce air and noise pollution and road degradation. 
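
The annual savings figure follows directly from the household count and the monthly estimate quoted above. A minimal Python sketch of the calculation, using only values stated in the text (the constant names are illustrative):

```python
HOUSEHOLDS = 19_473          # households cited in the text
MONTHLY_SAVINGS_USD = 7.50   # estimated savings per household per month
MONTHS_PER_YEAR = 12

annual_savings = HOUSEHOLDS * MONTHS_PER_YEAR * MONTHLY_SAVINGS_USD
print(f"US${annual_savings:,.0f} per year")  # US$1,752,570 per year
```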

Two local companies working within the City of Altoona 
have multiple large haulers. Scenario 3, improved efficiency, provides a 
realistic solution for these companies to complete solid waste collection within 
the city. Implementing this scenario would reduce collection time, distance 
traveled, and trips to the transfer station by 51.7, 76.7, and 70.7 percent, respectively, compared 
to Scenario 1. These savings are among the highest found in academic 
literature related to GIS optimization of solid waste collection. The largest 
savings reported previously were noted in the study by Apaydin and Gonullu (2008). 
They applied a shortest path model within GIS to Trabzon City, Turkey and 
noted decreases of 44.3 and 24.6 percent in route time and route distance for 
nine routes in 26 districts. The differences in savings between this research 
and Apaydin and Gonullu (2008) can be attributed either to model design or to 
the current collection system in the City of Altoona being among the most 
disorganized. 

Results from this research demonstrate that city officials must address 
solid waste collection. At a minimum they should implement a collection 
system similar to Scenario 2. This not only provides savings in terms of 
collection time and distance traveled, it will allow officials to better monitor 
and enforce curbside collection, especially recyclables. Solid waste companies 
and city residents could then be held more accountable. Also, the exact 
locations of each company’s customers are known with Scenarios 2 and 3. This 
provides the advantage of using the VRP model to establish driving directions 
for each company that provide the optimal collection route by minimizing 
travel time. 


5.1. Limitations 


Modeling is an effective approach to better understanding how certain 
phenomena behave when, in reality, it is impossible to obtain that 
understanding through actual implementation or real operations. A review of 
the academic literature confirms that GIS network analyst is an established 
approach to analyze and improve upon solid waste collection strategies. 
However, it should be noted that models can only offer one realization of 
many possible scenarios under a specific set of conditions set forth by the 
model’s assumptions. This research attempted to minimize any assumptions 
through field work and data collected by the IRC. Some assumptions could 
have been avoided had local companies been willing to provide information on 
their collection strategy. This research assumes a uniform amount of 50 
pounds of solid waste per household each week. While no one can precisely estimate the 
amount of waste per household, this amount is based on the total amount of 
solid waste each company brings to the transfer stations. The totals are specific 
to the City of Altoona and provide the most accurate representation available. 

Solid waste collection trucks were placed into three categories of mini (2 
cubic yard loader), mid (12 cubic yard loader), and large (25 cubic yard 
loader). However, some companies have multiple trucks of varying sizes, 
including 14- and 20-yard loaders. The three categories best represent the 
truck fleet observed. Even if the truck size of every company was confirmed, 
there are several factors that affect the amount of solid waste a full truck can 
haul. First is the age of the truck. New trucks pack much better than older 
trucks due to hydraulic fatigue in older trucks. Truck maintenance is also 
important, as neglecting to grease fittings on a weekly basis weakens the 
compactor’s ability to compact solid waste. A second factor is how households 
choose to place their solid waste on the curbside for pick-up. Waste placed in trash 
cans and bins without a lid is prone to fill with rainwater or snow. Wet refuse 
is much heavier but compacts much better. Last are the employees on the 
routes. The more often they compact the waste in the truck, the better 
the compaction. Some employees only compact when the hopper 
is completely filled, thus reducing the amount of waste the truck can haul. Even 
if the size of each company’s hauler were known, these factors make it 
difficult to quantify the total amount of solid waste each could hold. 
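
To illustrate how the uniform 50-pound assumption interacts with truck size, the following Python sketch estimates the weekly load for a set of households and the fewest trips to the transfer station a given payload would allow. The payload values are hypothetical: the text reports truck sizes in cubic yards only, and, as noted above, the weight a truck can actually haul depends on its age, its maintenance, and crew practice.

```python
import math

LB_PER_HOUSEHOLD = 50  # uniform weekly amount assumed in this research


def weekly_load_lb(households):
    """Total weekly solid waste (pounds) under the uniform 50-pound assumption."""
    return households * LB_PER_HOUSEHOLD


def min_trips(households, payload_lb):
    """Fewest transfer-station trips if every load reached the full payload."""
    return math.ceil(weekly_load_lb(households) / payload_lb)


# 175 households is the count cited for Company K; the payloads below are
# hypothetical values chosen only to illustrate the effect of truck size.
print(weekly_load_lb(175))    # 8750
print(min_trips(175, 1_500))  # 6
print(min_trips(175, 9_000))  # 1
```

The sketch gives a lower bound only; real routes may unload before reaching capacity.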

One final assumption to consider is the application of VRP to Scenario 1. 
The primary goal of VRP is to service each household while minimizing the 
overall collection time for each solid waste collection company. Thus, 
Scenario 1 results are based on the optimal collection route when, in reality, 
most if not all companies are driving completely different routes. The only 
way to truly model current collection is for companies to share their collection 
strategies and customer lists. Until then, results from the VRP model as it 
relates to current collection cannot be validated as reality. 
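
The interaction between truck capacity, trips to the transfer station, and distance traveled can be illustrated with a toy routing heuristic. The Python sketch below is not the VRP solver used in this research and does not produce optimal routes; it is a greedy nearest-neighbor illustration on straight-line distances, and the function name, coordinates, and capacities are all hypothetical.

```python
from math import hypot


def greedy_collection_route(depot, transfer_station, stops, capacity):
    """Nearest-neighbor heuristic with unloading at the transfer station.

    Each stop adds one unit of load. Returns (visit order, distance, trips).
    This is a simple illustration, not an optimal VRP solution.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")

    def dist(a, b):
        return hypot(a[0] - b[0], a[1] - b[1])

    position, load, distance, trips = depot, 0, 0.0, 0
    remaining, order = list(stops), []
    while remaining:
        if load == capacity:
            # Truck is full: unload before serving more households.
            distance += dist(position, transfer_station)
            position, load, trips = transfer_station, 0, trips + 1
        nearest = min(remaining, key=lambda s: dist(position, s))
        distance += dist(position, nearest)
        position, load = nearest, load + 1
        order.append(nearest)
        remaining.remove(nearest)
    # Final unload, then return to the depot.
    distance += dist(position, transfer_station) + dist(transfer_station, depot)
    return order, distance, trips + 1


# Hypothetical layout: four households along one street.
stops = [(1, 0), (2, 0), (3, 0), (4, 0)]
small = greedy_collection_route((0, 0), (0, 0), stops, capacity=2)
large = greedy_collection_route((0, 0), (0, 0), stops, capacity=4)
print(small[1], small[2])  # 12.0 2
print(large[1], large[2])  # 8.0 1
```

In this toy case the smaller truck makes twice as many trips and travels half again as far to serve the same households.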


CONCLUSION 


Solid waste management system planning has received wide attention 
from environmental planners because of its complex coordination of various 
management strategies. One issue is how to effectively distribute the 
collection crew size and vehicles in a metropolitan region. This research 
applied the VRP function to solid waste collection in the City of Altoona, 
Pennsylvania, USA. 

Focus was placed on understanding the current collection strategy and 
proposing two alternate scenarios. Results indicate the current collection 
strategy with 20 separate collection companies and a fragmented customer 
base is highly inefficient in terms of collection time, distance traveled, and 
number of trips to the transfer station. Each of the alternate scenarios greatly 
improves upon current collection in all areas. The controlled collection scenario, 
with six collection companies and clustered customers, offers a savings of 
approximately 70 percent in distance traveled and 44 percent in collection 
time. 

Additional savings were demonstrated with the improved efficiency 
scenario, as distance traveled and collection time were 76 and 50 percent less 
than current collection. Improved efficiency provides further savings 
compared to controlled collection, as distance traveled and collection time are 20 and 
12 percent lower, respectively. 

Savings in terms of distance traveled and collection time are significant 
because waste collection accounts for a very large part 
of a city’s operating budget. The two alternate scenarios demonstrate that 
optimization of waste collection service can significantly reduce collection 
time and distance traveled. This translates to financial savings for collection 
companies, which can then pass savings on to the customer. In the end, results 
from each scenario emphasize the need for city officials to make changes to 
the current collection system. It is my hope that city officials will adopt one of the 
proposed scenarios in the near future. 
Chapter 2 


CONCEPTUAL FRAMEWORK FOR USING GIS 
IN BUILDING COMMUNITY CAPITAL 
TOWARDS SUSTAINABILITY 


Sungsoon Hwang 
Department of Geography, DePaul University, Chicago, Illinois, US 


ABSTRACT 


Sustainability, the balancing of fundamental human needs with ecological 
resilience, has been embraced as an overarching policy goal, and 
communities have been called upon to participate in the process of attaining 
that ideal. Community-based organizations (CBOs) can benefit from 
using GIS in building community assets and developing sustainability 
initiatives. However, GIS has not yet been used widely for these purposes in 
CBOs. In this chapter, I illustrate how geographic information (such 
as maps) can be useful in community development drawing from 
community GIS projects, and explain how theories of sustainability and 
spatial thinking can be utilized in community-based efforts towards 
sustainability. CBOs can monitor and assess community sustainability by 
(a) organizing relevant indicators into the capital framework (theories of 
sustainability), and (b) exploring spatial distribution, interactions, 
relationships, and changes in sustainability-related issues using GIS 
(spatial thinking). The framework presented here can be applied to 
promote effective use of geospatial tools for community sustainability. 
1. INTRODUCTION 


This chapter presents how GIS can be used to help build community 
capital towards sustainability. Community capital is an existing community 
asset or resource that can be leveraged to improve quality of life in the 
community and bring about desired social change (Kretzmann & McKnight, 
1993; Roseland & Connelly, 2005). Community-based organizations (CBOs) 
work to build and enhance various forms of community capital, including 
environmental capital, physical capital, human capital, political capital, 
financial capital, social capital, and cultural capital (Green & Haines, 2012). 
The community can play a crucial role in sustainable development (United 
Nations [UN], 1992, Section III), and data and tools relevant to sustainable 
community development have proliferated due to advances in information and 
communications technology (ICT). Given this, it is timely and important to 
consider how GIS can assist community-based efforts to progress toward 
sustainability. 

GIS can aid the process of building community capital because spatial 
elements are present in various stages of community development from public 
participation and community organizing to community visioning. CBOs use 
geographic information (GI) (such as maps, geospatial data, and tools) to (a) 
support the activities of their staff members (administrative use of GI); (b) 
investigate conditions and plan for the provision of services (strategic use of 
GI); (c) plan specific action around a particular issue (tactical use of GI); and 
(d) persuade and organize more people to get involved in community issues 
and activities either through recruitment or grant-seeking (organizing) (Craig 
& Elwood, 1998). As sustainability emphasizes the balance of environmental, 
economic, and social goals, there is a need to better understand how different 
forms of capital interact with each other and how building specific community 
capital contributes to overall community sustainability. 

To assess whether a community is on a sustainable path, it is necessary to 
monitor the increase or decrease of total stocks of capital over time as well as 
the dynamics of human-environmental interaction unique to the community 
(Daly, 1973). One can analyze how different forms of capital—natural capital, 
built capital, human capital, and social capital—are related or in balance to 
assess sustainability using the capital framework in an information system 
(Meadows, 1998). This framework has advantages because it attempts to 
integrate different forms of capital and enables an analysis of sustainability 
conditions (strong and weak sustainability) informed by economic theories of 
sustainability (Solow, 1986; Pearce & Atkinson, 1993). The capital framework 
is criticized, however, for problems of data availability and lack of attention to 
intra-generational equity (UN, 2007). 
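
The distinction between weak and strong sustainability can be sketched as two checks on capital stocks tracked over time. The Python example below is a deliberately simplified reading: weak sustainability is taken to mean that the total stock of capital does not decline, and strong sustainability that natural capital itself does not decline. The index values, the equal weighting of capital forms, and the function names are hypothetical.

```python
def is_weakly_sustainable(stocks_by_period):
    """Weak sustainability (simplified): total capital never declines."""
    totals = [sum(period.values()) for period in stocks_by_period]
    return all(later >= earlier for earlier, later in zip(totals, totals[1:]))


def is_strongly_sustainable(stocks_by_period):
    """Strong sustainability (simplified): natural capital never declines."""
    natural = [period["natural"] for period in stocks_by_period]
    return all(later >= earlier for earlier, later in zip(natural, natural[1:]))


# Hypothetical index values for one community over three periods.
community = [
    {"natural": 100, "built": 100, "human": 100, "social": 100},
    {"natural": 90, "built": 120, "human": 105, "social": 100},
    {"natural": 85, "built": 130, "human": 110, "social": 105},
]

print(is_weakly_sustainable(community))    # True
print(is_strongly_sustainable(community))  # False
```

Here growth in built and human capital offsets the loss of natural capital under the weak criterion but not under the strong one.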

One can now objectively monitor aspects of sustainability that have been 
elusive to measure (e.g., indicators that fall into natural capital and social 
capital) through recent developments in geospatial technology, including 
hyperspectral and radar remote sensing, sensor networks, ubiquitous GPS 
tracking, and GeoWeb 2.0. Along with increased quantity and quality of data, 
GIS can provide a platform to integrate data from various sources across 
different geographic scales (global to local) and investigate the changing 
nature of the planet (National Research Council [NRC], 1997; NRC, 2010). As 
society continues to improve access to data across various domains (social to 
environmental) over space and time, it is likely that spatial thinking (NRC, 
2006) will play an important role in turning big data into knowledge in 
conjunction with relevant analytics and tools. Visual images can condense a 
large amount of data and present a message effectively enough to change 
behaviors (Sheppard, 2005). GIS is well suited to supporting the complex task 
of assessing sustainability (Graymore et al., 2009). 

Research shows that CBOs do not use GI as effectively as they should 
because they lack organizational capabilities, continuity, and connections 
(social networks) to deploy GIS; nor is their use of GI well aligned with 
organizational contexts and structure (Sieber, 2000; Elwood & Ghose, 2001; 
Esnard, 2007). CBOs fail to use GI not only because they do not know how to 
use GIS or what they can do with it, but also because they are not 
compelled to pose geographic questions. They are not compelled to discern 
geographic elements in planning and conducting their activities because they 
are rarely aware of the value of GI or trained to think spatially. In other words, 
CBOs’ (and the general public’s) inadequate appreciation of geospatial 
concepts in part contributes to underutilization of GI. Geographic concepts and 
GIS are conducive to sustainability discourse (Wilbanks, 1994; Whitehead, 
2006; Hwang, 2013), and thus it is necessary to promote the use of GI in 
CBOs’ efforts toward sustainable development. The purpose of this chapter is 
to demonstrate the utility of GI in building community capital and to present a 
conceptual framework to help CBOs to think spatially and increase the 
efficacy of their sustainability initiatives. 

This chapter is organized into three sections. In Section 2, I discuss the 
role of GIS in building community capital, largely illustrated by 
community-based service-learning GIS projects. In Section 3, I present how to organize different 
forms of capital to assess sustainability in a community following the capital 
framework. In Section 4, I discuss the typology of geospatial inquiries suited to 
exploring sustainability-related issues. Spatial thinking used to 
assess sustainability following the capital framework can guard against losing 
sight of holistic, context-specific, and dynamic notions of sustainability. 


2. BUILDING COMMUNITY CAPITAL WITH GIS 


In this section I present case studies in which GIS was applied to building 
community capital. The section is organized into different forms of 
community capital—namely, natural capital, built capital, human capital, and 
social capital—to connect with the capital framework presented in the next 
section. The case studies presented in this section are drawn from service- 
learning projects that students enrolled in the GIS program at DePaul 
University conducted in partnership with CBOs or non-profit organizations in 
the Chicagoland area from 2007 to 2013. The names of the partner CBOs are 
shown in italics in this section. 
Figure 1. Kernel density map of healthy trees in Lincoln Park, Chicago (2007). 


Natural Capital 


Natural capital is a community’s base of natural resources including air, 
water, land, flora, and fauna. Natural resources provide ecosystem services 
(e.g., flood control, waste assimilation) and are extracted as inputs to 
production (e.g., fossil fuel, timber, tourism) (Green & Haines, 2011). GIS 
students surveyed the location and health (measured as diameter at breast 
height [DBH]) of trees in a Chicago neighborhood using GPS receivers (Frye 
et al., 2007). 

Figure 2. Coal power plants and percent of Latino population by community areas in 
Chicago (2009). 

Figure 1 shows the kernel density of trees weighted by DBH in Lincoln 
Park. In the map, the darker the shade, the greater the concentration of healthy 
trees. The map indicates that trees are less healthy in non-residential areas 
around major roads with much vehicular traffic (the southern part of the map) 
than in highly residential areas (the northwestern part of the map). This project 
suggests how vehicle emissions may affect tree health. 
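
The idea of a kernel density surface weighted by DBH can be sketched in a few lines of Python. The example below is only an illustration: it uses a Gaussian kernel, which may differ from the kernel function of the GIS tool used for Figure 1, and the tree coordinates, DBH values, and bandwidth are hypothetical.

```python
import math


def weighted_kernel_density(x, y, trees, bandwidth):
    """Sum of Gaussian kernels centred on each tree, weighted by its DBH."""
    total = 0.0
    for tx, ty, dbh in trees:
        squared_distance = (x - tx) ** 2 + (y - ty) ** 2
        total += dbh * math.exp(-squared_distance / (2 * bandwidth ** 2))
    return total / (2 * math.pi * bandwidth ** 2)


# Hypothetical trees as (x, y, DBH): two large trees close together
# and one small, isolated tree.
trees = [(0, 0, 30), (10, 0, 30), (100, 0, 5)]

near_cluster = weighted_kernel_density(5, 0, trees, bandwidth=20)
near_isolated = weighted_kernel_density(100, 0, trees, bandwidth=20)
print(near_cluster > near_isolated)  # True
```

Evaluating the function over a regular grid of (x, y) locations would yield a surface comparable in spirit to the shaded map, with higher values where large trees are concentrated.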

GIS students in collaboration with Little Village Environmental Justice 
Organization (LVEJO) examined whether communities with a large Latino 
population were more likely to have a power plant and what related health 
effects might be (Becerra et al., 2009). Figure 2 shows that coal power plants 
are located in predominantly Latino communities in Chicago. 

Figure 3. Single room occupancy housing in Chicago (2012). 


Built Capital 


Built capital includes the buildings, infrastructure (roads, bridges, sewers), 
and other physical features (railroad tracks, vacant land) in the community. 
Affordable housing has been a major concern among CBOs, particularly in 
low-income communities. Single room occupancy housing (SROs), one 
source of affordable housing, has been on the verge of being converted to 
condos in Chicago. GIS students helped Lakeview Action Coalition locate 
SROs and other affordable housing stocks using GIS mapping (Cameron et al., 
2013). Figure 3 shows where SROs are located, and in what wards they are 
concentrated in the North Side of Chicago. CBOs can use this kind of map to 
protest against market pressure that would potentially displace low-income 
populations and to determine where to focus their organizing efforts. 

Figure 4. Bike infrastructure in Little Village, Chicago as of October 2010. 


Driving is the predominant way of getting around in Little Village, Chicago. 
In the process of a grant application, Enlace Chicago surveyed the 
transportation infrastructure and traffic patterns by different modes of 
transportation in Little Village. They used GIS to demonstrate the need for 
alternative modes of transportation (e.g., bicycling, public transportation) as a 
means to mitigate traffic congestion. Figure 4 shows bike meters and racks 
mapped by GIS students as of October 2010 in Little Village (Carlstrom & 
Vasquez, 2010). Figure 5 represents the relative share of traffic by different 
modes of transportation at major road intersections surveyed by GIS students 
as of March 2011 in Little Village (Boyter et al., 2011). The figure shows that 
cars dominate the traffic (80%) while public transportation (city buses) 
accounts for 1% of traffic. 

Figure 5. Survey of traffic in major road intersections in Little Village, Chicago as of 
March 2011. 


Existing community assets such as vacant lots can be used as public space 
for youth education. Camp Butterfly wanted to determine the availability of 
space that could potentially be used for green youth education. GIS 
students identified vacant lots, brownfield sites, youth centers, and 
community gardens using GIS to address the needs of Camp Butterfly in the 
Bronzeville community (Chaffin et al., 2012). This project demonstrates how 
built capital can be utilized in relation to building human capital in the 
community. 




Human Capital 


Human capital, as an essential community asset, refers to the 
characteristics of individuals performing within the labor market, including 
educational background, work experience, and health. Since human capital is 
closely related to the economic health of a community, many CBOs implement 
workforce development programs. Many immigrants in particular do not fully
participate in the labor market because of limited English skills. The Chicago
Federation of Labor’s Workers Assistance Committee (CFL-WAC) provides
maps of free ESL (English as a Second Language) centers to those who lack
English skills (Brost et al., 2007).

Figure 6. ESL centers and population without English skills in Chicago (2007).

Figure 6 shows where free ESL centers are located and where those
without English skills are concentrated at the tract level in Chicago. The figure
indicates that the far western and southeastern parts of Chicago do not have 
enough ESL services for the high proportion of those who need such services 
while other areas are relatively well-served by ESL centers. 

Adequate access to child care can be instrumental in supporting
workforce development. Illinois Action for Children used GIS to identify
so-called “child care deserts” in Cook County, that is, areas where the supply of
child care falls short of the likely demand (Stoll et al., 2009). For
this task, GIS students calculated the difference between the number of children
likely in need of child care and the slots available in child care centers in each tract.
Figure 7 shows how a tract performs relative to other tracts in meeting child
care needs. In the map, areas shaded brown or dark red represent child care
deserts where the demand far exceeds the supply of child care.
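
The tract-level calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the tract identifiers and counts are hypothetical, and the cited project may have estimated demand and supply differently.

```python
# Hypothetical tract records (not data from the study):
# (tract_id, children likely in need of care, slots in child care centers)
tracts = [
    ("T001", 420, 150),
    ("T002", 310, 330),
    ("T003", 560, 90),
    ("T004", 200, 180),
]


def child_care_gap(children_in_need, slots):
    """Unmet demand in a tract; positive values mean demand exceeds supply."""
    return children_in_need - slots


# Compute the gap for each tract and rank tracts from largest shortfall to smallest.
gaps = sorted(
    ((tract_id, child_care_gap(need, slots)) for tract_id, need, slots in tracts),
    key=lambda item: item[1],
    reverse=True,
)

for tract_id, gap in gaps:
    print(tract_id, gap)
```

Ranking tracts by the gap mirrors the relative comparison shown in Figure 7; joining the result to tract polygons for mapping would be a separate GIS step.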

The child obesity rate in Humboldt Park, Chicago is nearly three times the
national average. In response to this significant community health
issue, the Puerto Rican Cultural Center wanted to assess access to healthy food
and to inventory community assets that can be utilized to address it.
Students surveyed and mapped community assets and
the nutritional value of food venues. Figure 8 shows religious institutions, 
community assets (community gardens, farmers markets, and health and 
fitness programs), and grocers and restaurants categorized by nutritional value 
(produce options, limited produce options, no produce options) in Humboldt 
Park (Stutsman et al., 2010). Figure 9 shows what areas are within walking 
distance (500 meters) of grocery stores with fresh produce options (Knight et 
al., 2011). While there are various food venues in Humboldt Park, food venues 
with high nutritional value are not widely accessible. 
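
A walking-distance test of the kind behind Figure 9 can be sketched as follows. The coordinates are hypothetical and assumed to be in a projected system measured in meters, and straight-line distance is an assumption of this sketch; the source does not state whether straight-line or street-network distance was used.

```python
import math

WALKING_DISTANCE_M = 500  # threshold stated in the text

# Hypothetical projected coordinates in meters (illustrative only).
produce_grocers = [(1000.0, 1000.0), (2500.0, 1800.0)]
homes = {"A": (1200.0, 1300.0), "B": (1900.0, 900.0), "C": (2600.0, 2200.0)}


def within_walking_distance(point, stores, limit=WALKING_DISTANCE_M):
    """True if any store lies within `limit` meters (straight line) of `point`."""
    return any(math.dist(point, store) <= limit for store in stores)


# Flag each location as served or not served by a grocer with fresh produce.
served = {name: within_walking_distance(xy, produce_grocers) for name, xy in homes.items()}
print(served)
```

In a desktop GIS the same idea is usually expressed as a 500-meter buffer around each store, with locations outside every buffer treated as underserved.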

Chicago Public Schools (CPS) launched the Safe Haven, Safe Summer 
program in partnership with faith-based communities. The program offers free
educational programs to help keep youth away from crime, particularly in
low-income neighborhoods. GIS can be used to determine where the programs
might be needed. Enlace Chicago used GIS to assess violent crimes committed 
near schools in connection with the area’s youth population to determine 
potential demand for the Safe Haven program in Little Village. The percent of 
youth was mapped in Figure 10 to help identify potential demand, and the map 
was compared to available crime data nearby (Luna et al., 2012). 

The locations of emergency food providers (such as food pantries) were 
mapped to assist Little Village and Pilsen residents who seek food assistance 
(Robidoux et al., 2011). Figure 11 can help identify where underserved areas
are and determine how emergency food assistance can be coordinated at
different times and locations.

Figure 7. Child Care Deserts in Cook County (2009).

Figure 8. Community assets & food vendors in Humboldt Park (2010).


The Cook County Sheriff’s Office runs a Boot Camp program to provide
inmates in Cook County Jail with physical and job training. The Chicago
Federation of Labor’s Workers Assistance Committee (CFL-WAC) helps
prepare graduates of the Boot Camp for the job market and their return to their
communities. GIS was used to examine what communities the Boot Camp
participants return to and what the characteristics of those communities are
(Ambuehl et al., 2008). This mapping project suggests that if a participant
returns to a neighborhood with high crime, unemployment, or poverty, then the
participant is more likely to re-engage in criminal activity. This project
provides empirical support for the detrimental influences of negative social
capital on human capital and opportunity structures (neighborhood effects)
(Sampson et al., 2002).

Figure 9. Areas within 500 meters from grocery stores with fresh produce in Humboldt
Park (2011).

Figure 10. % Youth Population in Little Village.

Figure 11. Food pantries of Little Village & Pilsen in Chicago (2011).


Social Capital


Social capital refers to the characteristics of a collection of individuals that
are deemed to influence the quality of social relationships in a community,
such as social trust, norms, and cohesion. In a society with large stocks of
positive social capital, rules will be enforced fairly and therefore those who 
promote common interests can be rewarded properly. Social capital is 
developed over time in a way that is unique to the community. What 
constitutes social capital is subject to debate, but political engagement, 
corruption, crime, and volunteerism are widely used as measures of social 
capital (Narayan & Cassidy, 2001). 

Many non-profit organizations strive to reach a large number of 
communities in accomplishing their missions. For example, Chicago Fair 
Trade (CFT) wanted to monitor progress towards increasing the number of fair 
trade outlets in Chicago. GIS students created a map (Figure 12) that shows 
where those outlets are located, and which community areas lack fair trade 
outlets (Barron et al., 2011). CFT used the map to determine where they
should focus their outreach efforts. The Institute of Cultural Affairs (ICA) compiled
sustainability initiatives in Chicago’s 77 community areas circa 2012 with
the help of student interns. Students in the GIS program mapped these
initiatives in the process (Paschen et al., 2012). ICA created interactive maps
to facilitate sharing knowledge and resources based on common interests
towards community sustainability (ICA, 2013).

In addition to its community outreach and information sharing purposes,
GIS can help build other subcategories of social capital, namely political
capital, financial capital, and cultural capital. CBOs need to build and increase
political capital (access to decision making) in a community to accomplish
their missions effectively. CBOs do so by analyzing the local power structure,
exposing power relations, and educating the public on pertinent issues. GIS 
can help exhibit uneven distribution of power and its consequences in the 
community concerning environmental justice, zoning changes, site proposals, 
and so on. GIS has long been advocated as a tool for public participation due 
to its ability to help engage the public and to help build common 
understanding on issues. CBOs make spatial narratives to advance their 
agendas using GIS (Elwood, 2006). 

Latino Policy Forum mapped the Latino voting age population by election 
districts (Figure 13) to devise place-based strategies for mobilizing their 
constituencies (Hernandez et al., 2013). LVEJO exposed uneven power
relations manifested in the spatial distribution of hazardous waste sites (Becerra et
al., 2009). Active Transportation Alliance (formerly known as Chicagoland
Bicycle Federation) identified bicycle crash hot spots (Figure 14) using GIS to
raise awareness of bicyclists’ safety issues and to make evidence-based
recommendations for promoting bicycling (Weiss & Rygh, 2007).

Figure 12. Fair trade outlets in Chicago (2011).


While access to financial capital is crucial to economic development,
credit markets do not respond well to the needs of poor communities. The poor
and minorities often experience discriminatory and predatory lending, and lack
the resources to meet their credit needs. Underserved communities have been
vulnerable to market volatility, as observed during the recent recession. CBOs
protest against unfair lending practices and work to build local credit markets
while navigating the available policy tools.

Figure 13. Latino voting age population by U.S. Congressional Districts in Illinois
(2010). 


Latino Policy Forum assessed the impact of foreclosures on the Latino 
population using data provided through the Home Mortgage Disclosure Act 
(HMDA). They used GIS as a visual tool to demonstrate the uneven impact on 
the Latino population (McConnaughhay et al., 2013). Figure 15 shows that
foreclosure auctions were largely concentrated in predominantly Latino
communities from 2010 to 2012.

Figure 14. Number of bicycle crashes by community areas in Chicago (2005).

A growing number of CBOs (and governments) have relied on enhancing
cultural capital as a strategy to revitalize the local economy and promote a
sense of community. Their activities include hosting art-related events,
designating historic and conservation districts, and attracting “creative classes”
(Florida, 2014). Bronzeville Visitor Information Center believes that the
cultural assets scattered around Bronzeville can be leveraged to promote
tourism and generate local job opportunities. GIS was utilized to map cultural
assets as a source for developing cultural art tours in the community (Nieciak
et al., 2012) (Figure 16).

Figure 15. Percent of auctions from 2010 to 2012 by Chicago’s community areas.


Different forms of capital are not necessarily mutually exclusive, but
rather mutually reinforce one another. For example, workforce development
and civic education (human capital) affect social trust (social capital).
Affordable housing (built capital) cannot be developed without the proper use
of financial capital (access to credit) (Green & Haines, 2012). Characteristics
of social capital unique to a community can affect the rate at which innovation
drives economic development (e.g., Silicon Valley) (NRC, 1997).

CBOs can benefit from an improved understanding of the relationships
among different forms of capital. This can lead to designs for smart programs
that target environmental, social, and economic sustainability goals in a
balanced manner. Urban agriculture and community garden initiatives are good
examples of integrated programs designed to develop the workforce and
address food security issues while reducing the carbon footprint at the same time.

Figure 16. Tourism maps for Bronzeville, Chicago: cultural assets in Zone 1 (2012).


It appears that CBOs develop both specialized and integrated programs 
these days. For instance, CBOs can offer financial advice and technology 
training to community members facing the foreclosure crisis and joblessness in 
the community. While this (specialization of programs) is a welcome trend, 
more integrated programs that cut across natural and human systems are 
needed as sustainability increasingly becomes a new political ideal. This 
section suggests that thus far CBOs have been less active in developing 
programs related to natural capital than in those related to man-made capital. 
This, however, seems to be changing. According to the ICA (2013), there were 
over 900 sustainability initiatives in Chicago’s 77 community areas as of Fall 
2013. It should be noted, however, that there is ambiguity over what are 
considered sustainability initiatives, and the quality of the sustainability 
initiatives data posted on the ICA’s website is not verified. CBOs can apply 
the capital framework to analyze the interaction among different forms of 
capital and devise integrated programs toward sustainability. 


3. ASSESSING COMMUNITY SUSTAINABILITY USING
THE CAPITAL FRAMEWORK 


There is increasing recognition that we ought to live within ecological 
means so that we do not compromise the capabilities of future generations to 
meet their own needs (Brundtland, 1987). The concept of sustainability has
arisen from debates on how to balance the goal of economic and social 
development with environmental conservation. A consensus on human 
modification of the environment has emerged among the scientific community 
as scientists call for collective action to prepare for climate change (Solomon, 
2007). More and more communities are compelled to participate in the 
collaborative process towards sustainable development as communities 
recognize that sustainability is essential to their survival. Community-based
sustainability initiatives, however, have rarely been informed by theories of
sustainability or supported by tools for assessing sustainability.


Capital Theory of Sustainability 


Economic theories of sustainability extend the concept of capital to 
include nature. Natural capital has two components: resource capital and 
environmental quality (Purvis & Grainger, 2005). Resource capital refers to 
the stocks of all natural resources that are either renewable (e.g., forests, 
fisheries) or non-renewable (e.g., fossil fuels). Environmental quality refers to 
the condition of environmental sinks such as land, water, and air that provide 
an assimilative function. 

Critical natural capital is capital that is essential to support life and thus
should be maintained under all circumstances for the benefit of present and future
generations (Brand, 2009); examples include areas of tropical forests with high
biodiversity and carbon stocks. Theories state that a development path is
strongly sustainable if critical natural capital does not decline (Solow, 1986)
and is weakly sustainable if total stocks of capital (natural, built, human, and
social capital) do not decline (Pearce & Atkinson, 1993). The difference
between the so-called strong and weak sustainability conditions comes from
different views on the relation between economy and environment.
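
The two conditions can be made concrete with a small sketch. It assumes, purely for illustration, that every form of capital can be expressed in a common unit and observed over successive periods; the stock values and the field names (`natural`, `critical_natural`, and so on) are hypothetical and do not come from the cited works.

```python
CAPITAL_FORMS = ("natural", "built", "human", "social")

# Hypothetical development path: one record of capital stocks per period.
# "critical_natural" is the critical portion of the natural capital stock.
path = [
    {"natural": 100, "critical_natural": 40, "built": 50, "human": 30, "social": 20},
    {"natural": 90, "critical_natural": 35, "built": 65, "human": 32, "social": 20},
    {"natural": 85, "critical_natural": 33, "built": 75, "human": 33, "social": 21},
]


def never_declines(series):
    """True if each value is at least as large as the one before it."""
    return all(later >= earlier for earlier, later in zip(series, series[1:]))


def is_weakly_sustainable(path):
    """Weak condition: the total stock of capital does not decline."""
    totals = [sum(period[form] for form in CAPITAL_FORMS) for period in path]
    return never_declines(totals)


def is_strongly_sustainable(path):
    """Strong condition: critical natural capital does not decline."""
    return never_declines([period["critical_natural"] for period in path])


print(is_weakly_sustainable(path), is_strongly_sustainable(path))
```

In this hypothetical path, growth in built capital more than offsets the loss of natural capital, so the weak condition holds while the strong condition fails. This is the kind of case on which the two views disagree.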

If the environment subsumes the economy, sustainability means maintenance of
critical natural capital (that is, strong sustainability) since all man-made capital
(built, human, and social capital) depends on natural capital. If the environment
does not necessarily subsume the economy but rather is interdependent with
it, sustainability means maintenance of total stocks of capital (that is,
weak sustainability) since improvement in man-made capital (such as
development of alternative energy, enforcement of pollution tax) can offset
some decline in natural capital.

The first view is significant in that it specifies the minimal necessary
condition of sustainability and inspires environmentalists to action. This view,
however, neglects the human dimension: it does not recognize the potential
of man-made capital to improve natural capital. In addition, what constitutes
critical natural capital is subject to debate. The Green Belt Movement in
Kenya illustrates how man-made capital and natural capital can reinforce each
other: communities organize themselves to preserve natural resources by
planting trees, and in doing so educate the workforce and improve the
quality of life in their communities.

The second view suggests that one can minimize consumption of natural 
resources while finding solutions, leaving room for human efforts. However, it 
is unrealistic to presume perfect substitutive relationships between natural
capital and man-made capital since some critical natural capital simply has no
substitute. A more realistic view is to preserve critical natural capital while
acknowledging the tradeoff relations that exist between natural capital and 
man-made capital. Therefore, sustainability cannot be assessed without an 
understanding of this nature-society relation. The capital theory of 
sustainability provides a calculus for integrating natural and human systems 
(i.e., natural capital and man-made capital in economic terms), and offers a 
starting point to assess sustainability in an integrative manner. Once the 
concept of sustainability is theorized, aspects of sustainability can be measured 
to assess sustainability objectively. 


Developing and Organizing Sustainability Indicators Informed 
by Capital Theory 


What is measured can be managed. Without indicators, it would be hard to 
monitor progress toward sustainability. Sustainability indicators should serve
as a pointer to symptoms of a system (e.g., Biochemical Oxygen Demand, or
BOD, to monitor the self-cleansing of water) and as an orientor of systems (e.g.,
the rate of fossil fuel exploitation relative to the rate of alternative energy
development) (Meadows, 1998). Additionally, good sustainability indicators should be
integrative (i.e., should portray linkages among the environmental, economic,
and social dimensions of sustainability) and distributional (i.e., should
measure intra-generational equity in addition to inter-generational equity)
(Maclaren, 1996). Many sustainability indicators are global or national in
scope; fewer have been developed at the community level. Research 
recognizes that developing sustainability indicators at the community level 
involves collaborative and adaptive learning processes (Valentin & 
Spangenberg, 2000; Reed et al., 2006). Little research examines the role of 
technology (such as GIS) in this process. 

Sustainability indicators can be organized into natural capital, built/human 
capital, and human/social capital according to Daly (1973). An advantage of 
organizing sustainability indicators this way is that the framework can be tied 
to the capital theory discussed above (UN, 2007). Humans have modified 
natural ecosystems by exploiting resource stocks and depositing waste into 
sinks. At the same time, technology and political economy, the realm of 
human dimensions, can be deployed to minimize detrimental environmental 
effects. Sustainability indicators can be organized so that the dynamics of 
human-environmental interaction can be monitored and examined. The 
framework is hierarchical in that it acknowledges the fundamental quality of 
natural capital as the foundation of human economy, and is integrative in that 
it emphasizes the interaction between natural capital and man-made capital in 
relation to sustainability of the planet. 

The Daly Triangle (Daly, 1973) provides an idealized view of the capital 
framework (Figure 17). The triangle is premised on the idea that the economy 
is ultimately built to fulfill human well-being and all other forms of capital 
ultimately rest on natural capital. Therefore, well-being can be seen as ultimate 
ends and natural capital can be seen as ultimate means. Built capital, human 
capital, and social capital play a crucial role in rendering the ultimate means to 
be sustainably used and the ultimate ends to be sufficiently satisfied 
(Meadows, 1998). The commonly held view of economic growth overlooks 
environmental costs and the productive nature of well-functioning governance. 
In other words, the Daly Triangle attempts to revise the economic system from
one that accounts only for built and human capital as means to
one that also accounts for human/social capital as ends that need to be in balance
with natural capital.


Using the Capital Framework to Assess Community
Sustainability 


Once sustainability indicators are organized into four tiers following the 
Daly Triangle, one can monitor progress toward community sustainability by 
posing the following three questions; the questions focus on the sufficiency of 
ultimate ends, channeling from ultimate means to ultimate ends, and the 
sustainability of ultimate means (Meadows, 1998). 


• Are ultimate ends sufficiently realized?
• How efficiently are ultimate means translated into ultimate ends?
• Is natural capital (ultimate means) sustainably used?


The first question regards what “fundamental human needs” are in the face 
of an unsustainable pattern of consumption and production and whether there 
are any barriers to fulfilling human potential (e.g., social injustice, human 
rights issues) in contemporary society. The second question is concerned with 
how institutions function to induce the private sector or civil society to use 
natural resources efficiently and minimize human impacts. For example, do 
institutions enact policy frameworks that internalize (or account for) 
environmental costs to correct potential market failure? Examples include 
pollution taxes, congestion charges, and energy efficiency standards that meet 
community needs and enable adaptation to climate change. The third question 
can be posed to determine whether the rate of natural resource usage exceeds 
the rate of regeneration of stocks, and whether the rate of waste emissions 
exceeds the rate of recovering the absorptive capacities of sinks. 

While the capital framework has some theoretical utility, it encounters 
problems when put into practice in policy-setting and community 
development. The integrative nature of the framework may render the tasks of 
developing indicators overwhelming. There is an ongoing debate over the 
relationship between economy and environment and what constitutes critical 
natural capital, which does not encourage practitioners to utilize this 
framework. Finally, the framework is not explicit about intra-generational
equity. GIS can help address some of these problems, as discussed later.

We are still in the process of building a consensus on best practices for
measuring and assessing sustainability, and we do not yet understand the emergent
and complex properties of the dynamic human-environmental interaction
(Clark & Dickinson, 2003). As we make this journey, geospatial technology
can play a significant role in improving scientific understanding and building a
consensus that will shape the path towards sustainability.

Figure 17. Daly Triangle: the capital framework for organizing sustainability indicators.


4. ASSESSING COMMUNITY SUSTAINABILITY
SPATIALLY WITH GIS 


GIS is particularly useful for sustainability assessment since it can 
integrate data in various domains across natural systems (water, air, land) and 
human systems (political, social, economic characteristics). GIS can be used to
capture aspects of sustainability that have been elusive to measure, such as
natural capital and social capital, through hyperspectral remote sensing,
ubiquitous GPS tracking, and social media. In addition, visualization of
multidimensional data in GIS can compress large amounts of information and 
convey information effectively. With GIS, one can monitor change and 
analyze spatial distribution and the relationships of entities. This (integration, 
visualization, and analysis of appropriate data) can facilitate the collaborative 
learning process of identifying issues related to sustainability, particularly 
through open platforms like GeoWeb 2.0. GIS can help address the limitations 
of the capital framework discussed earlier (i.e., integration of large data, lack 
of understanding of the society-nature relation, lack of attention to 
distributional aspects of sustainability). Despite the potential of GIS discussed
above, GIS has been underutilized due to an inadequate understanding of spatial
concepts (Marsh et al., 2007).

Here I present a typology of geospatial inquiries applicable to 
sustainability-related issues. This is intended to address an inadequate 
appreciation of spatial concepts and to guide GIS users to think more spatially 
about sustainability. GIS can address the following five questions (Hwang, 
2013). Working definitions of those geospatial inquiries are provided below: 


• Spatial Distribution (SD): where things are in a place/region
• Spatial Interactions (SI): how things interact between places/regions
• Spatial Relationships (SR): where things are related across domains in
a place/region
• Spatial Comparisons (SC): where things differ among places/regions
• Temporal Relationships (TR): how things change in places/regions


Here, “thing” is used as an umbrella term to refer to objects (spatially
discrete phenomena such as land parcels and roads), fields (spatially
continuous phenomena such as temperature and elevation), and events
(dynamic geographic phenomena such as earthquakes and crime incidents)
(Goodchild et al., 2007). The term “place/region” is a domain-specific,
multi-scalar construct; for instance, climatic regions can be defined specifically for
the climatic domain at varying geographic scales.

SD involves recognizing the identity of a spatial entity (e.g., what is a 
watershed?) and identifying the location (or extent) of the spatial entity (e.g., 
where is a watershed?) and spatial distribution of variables (e.g., precipitation 
in a watershed). SI involves recognizing how entities are connected among 
places/regions (e.g., how does city A interact with city B?). SI is equivalent to 
the concept of flow and movement from one place/region to another. SR 
concerns noting how entities of one domain are associated with those of 
another (e.g., how is a runoff level associated with land cover types in a 
watershed?). SR is equivalent to the concept of correlation. SC focuses on 
differences between places/regions (or idiosyncrasy of locality) whereas SI 
focuses on connections between places/regions. TR examines changes in 
where things are, how things interact, how things are related, and how things 
differ. Figure 18 depicts five geospatial inquiries in the geographic matrix 
(Berry, 1964) that synthesizes subject matters of geography—characteristics 
(rows of the matrix) of regions (columns of the matrix) over time (tiers of the 
matrix). 
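
The five inquiries can be illustrated on a toy version of the geographic matrix. The sketch below is a minimal example under stated assumptions: the region names, characteristics, and values are hypothetical, the Pearson coefficient stands in for "correlation" in SR, and SI is represented with a separate origin-destination table because flows link pairs of regions rather than occupying single cells.

```python
import math

regions = ["R1", "R2", "R3", "R4"]

# matrix[time][characteristic] -> one value per region (hypothetical data).
matrix = {
    2000: {"impervious_pct": [10, 20, 30, 40], "runoff_mm": [110, 205, 290, 405]},
    2010: {"impervious_pct": [15, 25, 30, 50], "runoff_mm": [150, 260, 300, 480]},
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# SD: where things are (one characteristic across regions at one time).
sd = dict(zip(regions, matrix[2010]["runoff_mm"]))

# SC: how places differ (the same characteristic in two regions).
sc = sd["R4"] - sd["R1"]

# SR: how two characteristics are associated across regions.
sr = pearson(matrix[2010]["impervious_pct"], matrix[2010]["runoff_mm"])

# TR: how a characteristic changes over time in each region.
tr = [later - earlier for earlier, later in
      zip(matrix[2000]["runoff_mm"], matrix[2010]["runoff_mm"])]

# SI: flows between regions (hypothetical origin-destination counts).
flows = {("R1", "R2"): 120, ("R2", "R1"): 80}
net_flow_r1_to_r2 = flows[("R1", "R2")] - flows[("R2", "R1")]

print(sd, sc, round(sr, 3), tr, net_flow_r1_to_r2)
```

Each inquiry reads the same structure along a different axis: SD along a row, SC across columns, SR across rows, and TR across tiers.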


Figure 18. Geospatial inquiries in the geographic matrix (Hwang, 2013). Rows:
characteristics 1 to n; columns: regions 1 to m. SD: Spatial Distribution;
SI: Spatial Interactions; SR: Spatial Relationships; SC: Spatial Comparisons;
TR: Temporal Relationships.

Geospatial inquiries begin with describing the spatial distribution of
objects, fields, and events. Performing spatial interaction tasks (e.g., seeing 
how air pollution diffuses, how water flows along streams, and where food 
comes from) is important to understanding how places are interconnected. 
Engaging in spatial relationships inquiry encourages one to make connections 
across domains (interdisciplinary thinking) and better understand cross-cutting 
concepts of sustainability, “improving the quality of human life while living 
within the carrying capacity of supporting ecosystems” (IUCN, 1991). Spatial 
comparisons inquiry is instrumental in revealing place-specific characteristics. 
Temporal relationships inquiry is appropriate for investigating the intrinsically 
dynamic nature of sustainability of the biosphere. 


CONCLUSION 


In this chapter I propose that CBOs can use GIS to (a) help build different 
forms of community capital; (b) assess community sustainability as an 
interaction and a composite of different forms of capital; and (c) explore 
directional (inter-generational equity), distributional (intra-generational 
equity), and relational (integrative) aspects of sustainability. That is, one can 
better understand the dynamic society-nature relationship and monitor 
progress toward sustainability by applying spatial thinking to the capital 
framework in the GIS platform. The chapter illustrates the value of geographic 
information in community development and reviews how economic theories 
and geospatial tools can help foster community sustainability. 

CBOs will develop more environmentally-informed programs as public 
awareness of sustainability increases. The trend will accelerate with an 
improved ability to monitor the features and functions of natural ecosystems. 
The use of remotely sensed imagery or sensor networks in the field will allow 
us to monitor conditions of vegetation, chlorophyll in the ocean, climatic 
conditions, flooding, and water contamination in a timely fashion. Increasingly 
user-friendly geospatial tools will help us monitor these changes in an easy-to- 
understand manner. With further developments in spatial data infrastructure 
(Williamson et al., 2004) including the recent climate initiatives by the White 
House (2014), this future scenario will be realized soon. 


SPATIO-TEMPORAL VISUALIZATION
AND ANALYSIS TECHNIQUES IN GIS


Song Gao* 
California, Santa Barbara, CA, US


ABSTRACT 


Spatio-temporal visualization and analysis techniques play
important roles in geographic information systems and knowledge
discovery. In this research, we introduce three spatio-temporal analytical
techniques (spatio-temporal visualization, space-time kernel
density estimation, and spatio-temporal autocorrelation analysis) for
exploring human mobility and urban structure patterns. These
spatio-temporal analytics can help to answer questions such as: Where are the
spatio-temporal hotspots of human activities? How can
spatio-temporal patterns be explored in three-dimensional GIS? Experiments are
conducted using large-scale phone-call detail records in urban space.

Explorations of spatio-temporal functionality and techniques
contribute to the GIScience community for the future development of
Space-Time GIS; more broadly, these techniques have the potential to be applied
in other disciplines, e.g., environmental, urban, and social sciences.


1. INTRODUCTION


Temporal geographic information systems or Space-Time GIS (STGIS) 
were designed to process, manage, and analyze spatio-temporal data (Yuan, 
1996). In past decades, several spatio-temporal conceptual frameworks such as 
space-time path and prism for exploring individual activities (Hägerstrand, 
1970; Miller, 1991; Yu and Shaw, 2008), the integration of GIS in space-time 
representations (Couclelis, 1999), and key spatio-temporal data models such as 
snapshot models (Armstrong, 1988), event-based models (Peuquet and Duan, 
1995) and object-oriented data models (Worboys, 1992) have been widely 
studied and developed. Goodchild (2013) recently examined seven examples 
of distinct STGIS data types (i.e., tracking, temporal sequences of snapshots,
temporal sequences of polygon coverages, cellular automata, agent-based
models, events and transactions, and multidimensional data) and related
scientific questions.

Although humans have a keen ability to discover patterns hidden in
small-scale data, they may struggle with large-scale data that vary
over both space and time. Researchers have made great efforts in spatial data
mining and spatio-temporal visual analytics to raise the cognitive ceilings
that often limit the interpretation of large spatio-temporal datasets (Guo et
al., 2006; Shaw, Yu, and Bombom, 2008; Andrienko et al., 2010).

In the Mobile Age, with the widespread use of location-aware
devices, it is possible to collect large-scale location-aware datasets, such
as mobile phone call data, GPS-enabled taxi trajectories, and social media
data, to sense complex human movements and human-environment
interactions. In the new era of Big Data, it is of great significance to
explore and understand how cities function at short-term temporal scales, in
contrast to traditional long-term strategic planning (Batty, 2013).

For example, although human movements and activities may vary over
time across different regions, the observed activity hotspots and information
flows might exhibit a pattern of spatial dependence. Moreover, an analysis
that ignores the temporal dimension is not sufficient to discover the
underlying urban dynamics.

For instance, urban administrators might hope to monitor human movements
by observing the neighboring regions in previous time periods. In such
space-time integration contexts, spatio-temporal analytics should help to
answer questions such as:

• Where are the spatio-temporal hotspots of human activities?
• Do human/vehicle movement flows exhibit different spatio-temporal
  patterns in contrast to overall trips?
• How do these patterns relate to the population distribution and urban
  land-use structure?
• How to conduct the spatio-temporal autocorrelation analysis?
• What is the impact of spatio-temporal granularity and uncertainty?


To address these research questions, spatio-temporal data
mining techniques and workflows need to be studied. However, mining big
geo-data and discovering knowledge of spatio-temporal relations, patterns and
trends in real-world processes is not trivial. Issues of spatio-temporal
data organization, computation, and integration with visual representation and
human cognition still require substantial effort and
interdisciplinary knowledge. To this end, this paper proposes a
spatio-temporal data processing and analytical framework that can be applied
not only to exploring dynamic mobility and intra-urban flow patterns, but also
to other human and social science research that draws on emerging big
geospatial datasets and computing techniques.

This paper is structured as follows. In Section 2, we briefly discuss
related work and propose a spatio-temporal analytical framework that
includes spatio-temporal visualization (STV), space-time kernel density
estimation (STKDE), and spatio-temporal autocorrelation analysis (STAA) for
exploring human mobility and urban dynamic patterns. The methodology and
technical implementation of these analytics are presented in detail. We then
apply the framework to a large volume of geo-referenced mobile phone
call records in a city to reveal the spatio-temporal patterns hidden in such
big geo-data and, further, to understand the complex urban dynamics.

The data processing, experiments, main findings and discussions are
presented in Sections 3 and 4. We conclude the paper with a summary and
directions for further research in Section 5.


2. SPATIO-TEMPORAL ANALYTICAL FRAMEWORK 


Modelling human mobility patterns and understanding dynamic urban
structures based on large amounts of data from GPS sensors, mobile devices,
persons, vehicles, and street networks have become a hot topic in many fields
such as urban planning, transportation, GIScience and computer science (Jiang
and Claramunt, 2004; Ratti et al., 2006; Gonzalez et al., 2008; Kang et al.,
2012a, 2012b; Liu et al., 2012; Yuan et al., 2012; Gao et al., 2013b, 2013c;
Shen et al., 2013). In general, mining and analyzing such spatio-temporal
data require combined qualitative-quantitative approaches that
involve data extraction and analytics, statistical inference and
geovisualization. We present a spatio-temporal analytical framework (Figure 1)
that combines STV, STKDE and STAA for understanding spatio-temporal patterns
(both individual and aggregated) hidden in the Big Geo-Data. Each technique has
different characteristics and data-format requirements. During processing, the
raw data were converted into different data structures for various analytical
purposes. In the following subsections, we discuss the roles of the different
spatio-temporal analytics in the presented research.


2.1. Spatio-Temporal Visualization Techniques for Trajectory 
and Flow 


By utilizing the power of human vision, previous studies have demonstrated
the effectiveness of geovisualization in spatial data exploration and
knowledge discovery (MacEachren and Kraak, 2001; Kwan, 2004; Guo et al., 2005;
Andrienko et al., 2008).


Data Filtering and Extraction -> Processing ST Analytics -> Geovisualization 


Figure 1. A spatio-temporal analytical framework for identifying human mobility 
patterns and urban dynamics. 

The understanding of urban spatial structures can benefit the studies of
visualizing individual space-time behaviors (Kwan, 2000; Chai, 2013). We can
use both individual-based visualization and aggregation-based visualization to
explore dynamic patterns in urban studies. However, the representation of
dynamic human activities and movements over both space and time is one of
the major challenges in geocomputation and geovisualization. Hägerstrand’s
time-geography conceptual framework provides an excellent integrated
representation of human movements in space and time (Hägerstrand, 1970).

However, the space-time cube idea was not widely applied until GIS-based
implementations and analytical discussions about space-time
relationships, interactions and uncertainties moved forward (Miller, 2005;
Shaw, Yu, and Bombom, 2008; Chen et al., 2011; Nakaya, 2013), together with
the opportunity to explore potential human activities in both physical and
virtual spaces (Yu and Shaw, 2008).

For the individual-based movement representation, a space-time path (3D
polyline) was created to connect the time-ordered sequence of locations of one
person in a 3D-GIS environment, which consists of a two-dimensional horizontal
geographic plane and one vertical dimension of time (Figure 2).




Figure 2. Space-time path of an individual’s movement in a week. 

The space-time path can be used for visual exploration of continuous spatial
and temporal movement patterns. Several analytical models and measurements
have been developed by Miller (2005), such as the space-time prism, composite
path-prisms, stations, bundling and intersections, to further analyze the
complex spatio-temporal relationships among human activities and interactions
under specific space-time constraints. However, when numerous trajectories are
collected in a dataset, the paths become hard to visualize and interpret.
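
As a minimal illustration (a sketch under assumed inputs, not the
implementation used in this study), a space-time path can be built by sorting
one person's located records by time and using elapsed time as the vertical
coordinate. The function names, the hour-based time unit, and the sample
records below are hypothetical.

```python
from datetime import datetime

def space_time_path(records, t0):
    """Return time-ordered (x, y, z) vertices; z is hours elapsed since t0."""
    ordered = sorted(records, key=lambda r: r[2])
    return [(x, y, (t - t0).total_seconds() / 3600.0) for x, y, t in ordered]

def dwell_segments(path, tol=1e-9):
    """Vertical segments: consecutive vertices at the same (x, y) location."""
    return [(a, b) for a, b in zip(path, path[1:])
            if abs(a[0] - b[0]) <= tol and abs(a[1] - b[1]) <= tol]

# Hypothetical records: (longitude, latitude, timestamp), deliberately unsorted.
t0 = datetime(2007, 7, 23)
records = [
    (127.502, 50.241, datetime(2007, 7, 23, 12, 15)),
    (127.495, 50.243, datetime(2007, 7, 23, 9, 25)),
    (127.495, 50.243, datetime(2007, 7, 23, 11, 0)),
]
path = space_time_path(records, t0)
stays = dwell_segments(path)
```

Consecutive vertices that share a location form the vertical segments of the
path, i.e., periods in which the person stays at one place.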

Different approaches for generalization and aggregation of massive movement
data have been introduced, such as the traffic-oriented view and the
trajectory-oriented view (Andrienko and Andrienko, 2008).

In the urban context, the aggregation of massive human movement
trajectories by origins and destinations (OD) can be utilized to understand the
dynamic OD-flow patterns among traffic analysis zones (TAZs) or other polygonal
divisions of a region at different temporal scales. Traditional flow
mapping represents the amount and the direction (with an arrow
symbol) of from-to movements of people or goods among regions in 2D
space, such as migration and goods trade (Tobler, 1987).
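
The aggregation step can be sketched as follows, assuming each trajectory has
already been reduced to a time-ordered list of zone identifiers (a
hypothetical input format). The resulting counts correspond to the non-zero
cells of an OD-flow matrix; the function name is illustrative.

```python
from collections import Counter

def od_flows(trajectories):
    """Count moves between consecutive, distinct zones over all trajectories."""
    flows = Counter()
    for zones in trajectories:
        for origin, dest in zip(zones, zones[1:]):
            if origin != dest:  # staying in the same zone is not a flow
                flows[(origin, dest)] += 1
    return flows

# Hypothetical zone sequences for three persons.
trajectories = [["A", "A", "B", "C"], ["A", "B"], ["C", "A"]]
flows = od_flows(trajectories)
```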

Some graph layout optimization algorithms and aggregation strategies 
have been suggested to minimize the edge crossings between flow symbols 
(Phan et al., 2005; Andrienko and Andrienko, 2011). Here we introduce 
another approach that uses vertical Bézier curves in a 3D-GIS environment for
interactive visual exploration of information or movement flows between 
places. The main advantages of such an approach lie in the integration of 3D 
visualization techniques which support interactions between 3D geometry 
objects and OD-flow values in multiple time snapshots or in a continuous 
animation. 

A Bézier curve is defined by a set of control points $P_0$ through $P_n$,
where $n$ is called its order ($n = 1$ for linear, $n = 2$ for quadratic,
which is used in our work, etc.). Bézier curves have been widely used in
computer graphics and geometric design (Farin, 1996). We develop an algorithm
to approximate the quadratic Bézier curves.

As shown in Figure 3, the first point $P_0$ and the last point $P_n$ are used
to represent the centroids of two regions (i.e., the origin and the
destination of each flow), and the intermediate points are interpolated by the
standard Bézier functions. We then project a flow between regions in the
3D-GIS environment. We wrote a Python script to generate all Bézier curve
control points based on the OD-flow matrix table and link them in Esri’s
ArcScene software for further interactive exploration of information flow or
physical movement flow patterns.
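
A vertical quadratic Bézier curve between two centroids can be approximated
as sketched below. This is an illustrative sketch under stated assumptions,
not the script used in this work: the middle control point is assumed to sit
directly above the midpoint of the two centroids, and its height is assumed to
be supplied by the caller (e.g., scaled from the OD-flow value). The function
name and the sampling density are hypothetical.

```python
def vertical_bezier(origin, dest, height, n_points=21):
    """Sample a quadratic Bezier arc from origin to dest as (x, y, z) points."""
    (x0, y0), (x2, y2) = origin, dest
    p0 = (x0, y0, 0.0)
    # Assumed placement: middle control point above the midpoint of the chord.
    p1 = ((x0 + x2) / 2.0, (y0 + y2) / 2.0, float(height))
    p2 = (x2, y2, 0.0)
    points = []
    for k in range(n_points):
        u = k / (n_points - 1)
        # Quadratic Bernstein weights: (1-u)^2, 2(1-u)u, u^2.
        a, b, c = (1 - u) ** 2, 2 * (1 - u) * u, u ** 2
        points.append(tuple(a * p0[d] + b * p1[d] + c * p2[d] for d in range(3)))
    return points

curve = vertical_bezier((0.0, 0.0), (10.0, 0.0), height=4.0, n_points=5)
```

Note that the arc itself peaks at half the height of the middle control point
(at $u = 0.5$), so the control height should be chosen with that in mind.
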
Figure 3. Drawing and visualizing vertical Bézier curves in 3D-GIS environment
(ArcScene).


2.2. Space-Time Kernel Density Estimation 


As discussed above, the temporal information of movements in geographic
space is important for detecting the spatio-temporal trends of underlying
human mobility. However, with the increasing number of aggregated
human/vehicle trajectories in urban space, the space-time path representation
becomes hard to interpret because of overlapping and cluttering. To solve this
problem, an extension of kernel density estimation (KDE) (Silverman, 1986)
was suggested. KDE has been widely used in spatial analysis to characterize
a smooth density surface that shows the geographic clustering of point
or line features in 2D space. To incorporate the time information,
space-time kernel density estimation (STKDE) generalizes the 2D-space KDE to
the 3D space-time cube, which supports the exploration of spatio-temporal
patterns, clusters and changes. Such STKDE techniques have been used in
several studies, such as crime clustering analysis (Brunsdon et al., 2007;
Nakaya and Yano, 2010), trajectory data mining (Demšar and Virrantaus, 2010),
publication citation analysis (Gao et al., 2013a), and dengue fever pattern
discovery (Delmelle et al., 2014). The STKDE value of each voxel (volumetric
pixel) in the three-dimensional space-time cube is estimated as:


$$D(x, y, t) = \frac{1}{n h_s^2 h_t} \sum_{i=1}^{n} K_s\left(\frac{x - x_i}{h_s}, \frac{y - y_i}{h_s}\right) K_t\left(\frac{t - t_i}{h_t}\right) \qquad (1)$$


where $D(x, y, t)$ is the density estimation of each voxel based on the data in
neighboring volumetric pixels; $n$ is the number of point events, and $h_s$ and
$h_t$ are the spatial and temporal neighboring bandwidths. Each point
$(x_i, y_i, t_i)$ in the neighboring pixels is weighted based on its proximity
in both space and time to the voxel using kernel functions ($K_s$ and $K_t$).
In this study, the Epanechnikov kernel is used for multivariate probability
density estimation within the bandwidths (Epanechnikov, 1969).

As with the 2D spatial KDE, larger bandwidths may produce an over-smoothed
surface, while smaller bandwidths may fail to reveal trends,
so both the spatial and temporal bandwidths of STKDE need to be calibrated
through experiments with actual datasets.
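
A direct, unoptimized evaluation of Equation (1) at a single voxel center
might look like the following sketch. The product of a 2D Epanechnikov kernel
in space and a 1D Epanechnikov kernel in time is an assumption consistent with
the definitions above; the function names and sample events are hypothetical,
and a practical implementation would loop over all voxels and use a spatial
index.

```python
import math

def k_space(u, v):
    """2D Epanechnikov kernel on the unit disc."""
    d2 = u * u + v * v
    return (2.0 / math.pi) * (1.0 - d2) if d2 < 1.0 else 0.0

def k_time(w):
    """1D Epanechnikov kernel on [-1, 1]."""
    return 0.75 * (1.0 - w * w) if abs(w) < 1.0 else 0.0

def stkde(x, y, t, events, hs, ht):
    """Space-time kernel density at (x, y, t) for events [(xi, yi, ti), ...]."""
    total = sum(k_space((x - xi) / hs, (y - yi) / hs) * k_time((t - ti) / ht)
                for xi, yi, ti in events)
    return total / (len(events) * hs ** 2 * ht)

# Hypothetical events: coordinates in meters, time in minutes.
events = [(0.0, 0.0, 0.0), (500.0, 0.0, 100.0), (5000.0, 5000.0, 3000.0)]
density = stkde(0.0, 0.0, 0.0, events, hs=1000.0, ht=500.0)
```

Only events inside both bandwidths contribute; the third event above lies
outside them and adds nothing to the estimate.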

The results of STKDE are volume data, i.e., 3D grids. Direct visualization
of STKDE results would require a four-dimensional space because the
volumetric data structure consists of 2D geographic space, time, and one more
dimension for the density estimate. Such volume visualization is not very
common in GIS but is popular in geophysics, geology, medical science, and
computer graphics (Kaufman, 1990). Three main approaches to volume
visualization were discussed by Demšar and Virrantaus (2010): (1) direct
volume rendering by assigning color and transparency to voxels; (2)
isosurfaces, the equivalent of isolines connecting points of equal value on a
two-dimensional map; and (3) volume slicing by planes. We apply the volume
slicing approach, assigning a color scheme and transparency to the voxels, for
consistency with KDE visualization in GIS.


2.3. Spatio-Temporal Autocorrelation Analysis 


Analyzing spatio-temporal autocorrelation structures of human activities 
would be helpful to understand the urban dynamic patterns in space and time 
simultaneously. In statistics, autocorrelation can be taken as the correlation of 
a variable with a lagged specification of itself (Box et al., 2008). 

The temporal autocorrelation can be defined as the correlation between values
of the same variable $X$ at different times $s$ and $t$:

$$R(s, t) = \frac{E[(X_s - \mu)(X_t - \mu)]}{\sigma^2} \qquad (2)$$

where $E$ is the expected value operator, $\mu$ is the mean of the observed
values and $\sigma^2$ is the variance. The temporal autocorrelation can be
used to explore time-series autocorrelation patterns.
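
For an observed series, Equation (2) is commonly estimated with the sample
autocorrelation at a lag $k = t - s$. The sketch below uses this standard
estimator; the function name and the toy series (a signal repeating every
four steps) are illustrative assumptions.

```python
def autocorr(series, lag):
    """Sample autocorrelation of a series at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    var_sum = sum((x - mean) ** 2 for x in series)
    cov_sum = sum((series[i] - mean) * (series[i + lag] - mean)
                  for i in range(n - lag))
    return cov_sum / var_sum

# Toy series with period 4: strongly positive at lag 4, negative at lag 2.
signal = [0, 1, 0, -1] * 6
r_same_phase = autocorr(signal, 4)
r_opposite_phase = autocorr(signal, 2)
```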

With regard to the spatial dependence, spatial autocorrelation (association) 
statistics have been used to analyze the degree of dependency among 
observations in a geographic space (Cliff and Ord, 1973). These measurements 
can be divided into two categories: global indices and local indices. Classic 
global indices of spatial autocorrelation include Moran’s I (1950), Geary’s C 
(1954), and Getis-Ord’s General G (1992), while local indices of spatial 
association (LISA) can be established by transforming the global indices into 
corresponding local measurements based on different measures of similarity 
(Anselin, 1995). All of these spatial autocorrelation statistics require a spatial 
weights matrix that reflects the intensity of the geographic relationship 
between observations and their neighbors, e.g., a distance-to-neighbor matrix
or a binary matrix in which each element is 0 or 1 depending on
whether the observation location shares a boundary with the
neighbor. As suggested by Hardisty and Klippel (2010), adding the temporal
neighbors into the weights matrix would be one approach to extend the 
traditional spatial autocorrelation measurements. 
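
For reference, the classic (purely spatial) global Moran's I with a binary
contiguity matrix can be computed as in the sketch below. It only illustrates
the role of the weights matrix and is not the spatio-temporal extension used
in this chapter; the function name and the four-cell chain used as input are
hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

# Four cells in a chain; 1 marks a shared boundary.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_trend = morans_i([1, 2, 3, 4], chain)            # similar neighbors
i_alternating = morans_i([1, -1, 1, -1], chain)    # dissimilar neighbors
```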

Here, we present three extended global measures of spatio-temporal 
association regarding the spatial version of Moran’s I, Geary’s C, and Getis- 
Ord’s General G: 


[Equations (3)-(5): definitions of the global measures $I_{st}$, $C_{st}$ and $G_{st}$.]

where $I_{st}$, $C_{st}$ and $G_{st}$ can be taken as different formats of
space-time cross-correlation (or cross-product) models (Getis, 1991); $Z$ is
the target variable of interest; $i$ and $j$ are indices of the total $N$
spatial units; $w_{ij}$ is an element of the $k$-order-neighbor spatial
weights matrix; the $\bar{z}$ terms are the means of variable $Z$ within a
time lag, while the $\sigma$ terms are the variances.


The local measures of spatio-temporal autocorrelation can be derived by 
decomposing a global measure into particular spatial neighboring units. 

In the experiment section, we evaluate how different spatial and
temporal neighbors (lags) affect the results of the three spatio-temporal
autocorrelation measures for the mobile phone call activities in a city.


3. DATA PROCESSING 


In this research, the dataset contains one week of about 74,000,000
anonymized mobile phone call detail records (CDR) in a large city, provided by
a Chinese telecommunication operating company. The CDR data list the
caller, receiver, mobile base stations, date, time, duration, etc. (Table 1).

As shown in Figure 4, every time a user (caller/receiver) made a
call, he/she was geo-referenced to the corresponding mobile base station,
which has a unique longitude/latitude position. The coverage area of each
mobile base station can be expressed as a Voronoi polygon for call activity
analysis and is termed a “cell”.

In this Voronoi partition, all phone calls within a given polygon are closer
to the corresponding mobile base station than to any other station.
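
Assigning a location to a cell of this partition is equivalent to a
nearest-station search, as the sketch below illustrates. Planar distance on
longitude/latitude is used as a simplifying assumption, the two station
coordinates are taken from the sample rows of Table 1, and the function name
is hypothetical.

```python
import math

def nearest_station(point, stations):
    """Return the ID of the base station closest to point (planar distance)."""
    return min(stations, key=lambda sid: math.dist(point, stations[sid]))

# Station ID -> (longitude, latitude), from the sample rows of Table 1.
stations = {"A": (127.495, 50.243), "B": (127.502, 50.241)}
cell = nearest_station((127.496, 50.2425), stations)
```

For city-scale work, a projected coordinate system would be preferable to raw
longitude/latitude so that distances are metric.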

Generally, urban central regions have a higher density of mobile cells (the 
coverage area of each cell is smaller) than the outer suburb regions. 

Different data subsets have been extracted and processed for various 
research purposes such as human mobility modelling (Kang et al., 2012b), 
travel behaviour studies (Yuan et al., 2012), population estimation (Kang et
al., 2012a), and urban community structure analysis (Gao et al., 2013b). 


Table 1. Data format of mobile phone call detail records 


Caller    | Receiver | Date       | Start Time | End Time | Duration (seconds) | Base Station | Long    | Lat
ServeNub1 | oppNub1  | 2007-07-23 | 09:25:10   | 09:28:20 | 190                | A            | 127.495 | 50.243
ServeNub1 | oppNub2  | 2007-07-23 | 12:15:32   | 12:15:52 | 20                 | B            | 127.502 | 50.241


[Figure 4 legend: mobile base station; actual movement path; call interaction;
approximate movement flow. Scale bar in kilometers.]


Figure 4. Spatial distribution of mobile base stations and an illustration of phone call 
interactions among cells. 


In this work, on the one hand, for individual spatio-temporal mobility pattern
analysis, the approximate movement trajectory of each mobile
subscriber was created by connecting his/her series of geo-referenced call
records in space and time.

On the other hand, for urban dynamic structure analysis, phone call activities
and phone users’ physical movement flows were aggregated based on the
Voronoi polygons.
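
The trajectory-building step can be sketched as follows, assuming each record
has been reduced to caller, date, start time, and base station as in Table 1.
The tuple layout, the user identifiers, and the function name are
hypothetical; this is an illustration, not the processing code used for the
dataset.

```python
from collections import defaultdict
from datetime import datetime

def build_trajectories(cdr_rows):
    """Map each caller to a time-ordered list of (timestamp, station) visits."""
    by_user = defaultdict(list)
    for caller, date, start, station in cdr_rows:
        stamp = datetime.strptime(f"{date} {start}", "%Y-%m-%d %H:%M:%S")
        by_user[caller].append((stamp, station))
    return {user: sorted(visits) for user, visits in by_user.items()}

# Hypothetical rows, deliberately out of time order.
rows = [
    ("user1", "2007-07-23", "12:15:32", "B"),
    ("user1", "2007-07-23", "09:25:10", "A"),
    ("user2", "2007-07-23", "10:00:00", "B"),
]
trajectories = build_trajectories(rows)
```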

More detailed spatio-temporal analysis and results will be presented in the 
following section. 


4. EXPERIMENTS 


4.1. Individual Spatial-Temporal Movement Patterns 


In urban studies, understanding human movement patterns over
time is very important for transportation planning and city management. Figure
5 illustrates the 3D STV of mobile users’ tracked trajectories in space
over one week. Each color represents a single mobile subscriber. Different
shapes of 3D polylines reveal different space-time behaviors for specific
users. For instance, regular patterns at fixed locations (e.g., home and
workplaces) with respect to the daily/weekly cycles are indicated by the length
of the vertical segments of the space-time paths (Figure 5a). In addition,
irregular movement patterns were also found for certain types of individuals
(Figure 5b). One might guess that such a user is a delivery employee or works
in a related profession. The STV technique helps to visualize
mobile trajectory patterns in space and time simultaneously and in an intuitive
manner, but statistical analysis may be needed to extract more meaningful
personal places of interest (POIs). Figure 6a displays the spatial distribution
of a mobile user’s call-activity movement in a week. It shows that this person
has a higher probability of frequently visiting a few places (Cell A: 0.31;
Cell B: 0.22; Cell C: 0.14), while the visiting probability of most cells is
less than 0.05. Interpreting the time-series graphs of these frequently
visited places offers the possibility of identifying the personal POIs.
As shown in Figure 6b, the user often made phone calls at Cell A and Cell B
during 19:00~24:00 throughout the whole week.

One can infer that his/her home might be located near or inside the
boundary of the two cells. Also, the temporal signature of Cell C may indicate
a workplace, since the user is present there only during office hours
(9:00~18:00) on weekdays and not on weekends. However, detailed
information about personal POIs may need more investigation
of the daily-activity trips of the mobile subscribers or study of the
geographical contexts.
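
The two summaries used in this interpretation, the visiting probability per
cell and the hourly temporal signature of a cell, can be sketched as below.
The input format (hour of day, cell ID), the sample visits, and the function
names are hypothetical and serve only as an illustration.

```python
from collections import Counter

def visit_probability(visits):
    """Share of a user's call records that fall in each cell."""
    counts = Counter(cell for _, cell in visits)
    total = sum(counts.values())
    return {cell: c / total for cell, c in counts.items()}

def temporal_signature(visits, cell):
    """Number of calls made at `cell` in each hour of day (index 0-23)."""
    signature = [0] * 24
    for hour, visited in visits:
        if visited == cell:
            signature[hour] += 1
    return signature

# Hypothetical (hour, cell) records for one user.
visits = [(20, "A"), (21, "A"), (22, "B"), (10, "C"), (23, "A")]
probabilities = visit_probability(visits)
evening_profile = temporal_signature(visits, "A")
```

A cell whose signature concentrates in the evening hours would be a candidate
home location, and one concentrated in office hours a candidate workplace.
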

In order to identify the spatio-temporal hot-spots of individual phone call
locations, we calculated the space-time kernel density estimation (STKDE) of
each user’s call activities. Figure 7a shows an individual’s mobile phone call
activities over a week in a space-time cube. The STKDE of this user’s call
activities was calculated at a spatial resolution of 500 meters and a temporal
resolution of 100 minutes. Different combinations of spatio-temporal
bandwidths result in different visual representations.




Figure 5. Space-time visualization of mobile phone users’ trajectories: (a) regular 
movement patterns of three individuals; (b) irregular movement pattern of an 
individual. 


Figure 6. Spatial distribution and temporal signatures of an individual’s
frequently visited mobile cells: (a) spatial display of cell visiting
probability for a user (a larger circle represents a higher probability); (b)
time-series graphs showing the temporal signatures (visiting frequency) of the
user’s frequently visited places (mobile cells) over a week.

Figure 7. Visualizing georeferenced phone call activities and space-time density: (a)
phone call events in a space-time cube; (b) STKDE results for a specified mobile user in
a week.


The selection of bandwidths needs several rounds of calibration,
adjusting both spatial and temporal bandwidths to find a combination that
helps to uncover hidden patterns. We used twice the spatial resolution
(1 km) as the spatial bandwidth for both horizontal dimensions and 500 minutes
as the temporal bandwidth for the vertical dimension.

The resulting density volume of 50 × 50 × 100 voxels for a user’s call
activities is visualized in Figure 7b.

One may find that this user is more likely to make calls at certain fixed
locations (red and orange voxels) across time. This example shows that the
STKDE approach makes it much easier to visually identify the spatio-temporal
hot-spot patterns of call activities.


4.2. Aggregated Phone-Call Interaction Patterns 


One of the prominent characteristics of phone call data lies in its
indication of human communications and spatial interactions among different
places. The aggregated phone call interactions among mobile cells at different
times represent the dynamic intra-urban communication landscape, which
cannot be captured using traditional activity survey data. As shown in
Figure 8, vertical Bézier curves were created to represent the hourly
phone-call flow across cells. The height of each arc represents the relative
volume of phone calls. The tall and narrow arcs show strong call communication
within a nearby intra-urban space, while some long-distance curves across
non-adjacent cells indicate strong call interactions among these spatially
separated regions.

Figure 8. Vertical Bézier curve 3D visualization of hourly phone-call flow patterns
across cells in a day. (Green dots are locations of mobile base stations; each arc
represents an OD flow linking two mobile cells).


Such information flow patterns are strongly related to daily human activity
rhythms and working-social connections, as well as to geographical contexts
including urban land-use types and spatial distributions of home-job locations
(Ratti et al., 2006; Gao et al., 2013b).

In order to understand the dynamic “source-sink” structures of the information
landscape (Liu et al., 2012), we calculated the phone call net-balance flow
for each mobile cell by subtracting the outgoing call volume from the
incoming call volume in each hour. Figure 9 shows the time-series plot of
phone-call net flow among all cells. Each line represents the net flow pattern
for a specific Voronoi mobile coverage area. Figure 10 clearly shows the
dynamic spatial distributions of the “source” areas (red), which have
more outgoing phone calls, and the “sink” areas (blue), which have more
incoming phone calls, in different hours. Yellow cells indicate that the
net-balance call flow in that hour is zero. One can interactively interpret the
call flow patterns in the 3D-GIS environment or sense the dynamic urban
phone-call landscape in animation mode.
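
The net-balance computation described above (incoming minus outgoing calls
per cell and hour) can be sketched as follows. The record layout (hour,
origin cell, destination cell), the sample calls, and the function names are
hypothetical.

```python
from collections import defaultdict

def net_flow(calls):
    """Return {(hour, cell): incoming - outgoing} for (hour, origin, dest) calls."""
    net = defaultdict(int)
    for hour, origin, dest in calls:
        net[(hour, origin)] -= 1  # one outgoing call for the origin cell
        net[(hour, dest)] += 1    # one incoming call for the destination cell
    return dict(net)

def classify(balance):
    """Label a cell-hour as in Figure 10: source, sink, or zero-balance."""
    if balance > 0:
        return "sink"
    if balance < 0:
        return "source"
    return "zero-balance"

# Hypothetical calls within one hour (9 AM).
calls = [(9, "A", "B"), (9, "A", "C"), (9, "B", "A"), (9, "C", "B")]
balances = net_flow(calls)
labels = {key: classify(value) for key, value in balances.items()}
```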


4.3. The Spatio-Temporal Autocorrelation Patterns 
of Phone Calls 


The study of the spatio-temporal autocorrelation structure of mobile phone
calls in urban space can help to understand citizens’ mobile communication
patterns and urban structures. To investigate how the spatial autocorrelation
structure changes throughout the day in this city, the phone-call
volume was first aggregated into the Voronoi cells by hour.
Then, the Moran’s I local indicator of spatial association (LISA) (Anselin,
1995) was calculated for each cell in every three-hour period (see Figure 11).
The value of Moran’s I has been standardized to lie in [-1, 1]. If the index is
larger than 0, the cell shows positive spatial autocorrelation with its
neighbors; if the index is smaller than 0, it shows negative spatial
autocorrelation of phone call patterns. The closer the value is to 1 (or -1),
the stronger the positive (or negative) spatial autocorrelation. Examining the
spatial structure of LISA in different time periods, we can clearly see that the
spatial autocorrelation patterns of phone calls across all cells are very dynamic
and heterogeneous.

The central region (small cells) shows more diverse patterns than the outer
suburban areas, where most spatially adjacent cells show similar values
throughout the whole day.

This might reflect the mixed land-use types of the urban central areas and
the convergence and divergence of people there, with various phone call
behaviors in different time periods.

To identify a more stable autocorrelation structure, we apply a spatial
statistical test that runs 10,000 simulations of randomized permutations of
neighboring cells to find the locally significant spatial autocorrelation
patterns (Anselin, 1995).
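
A local Moran statistic with a permutation-based pseudo p-value can be
sketched as follows. This is an illustration under assumptions, not the
procedure used to produce the figures: the conditional permutation scheme
(the target cell's value is held fixed while the remaining values are
shuffled), the two-sided comparison, the unstandardized weights, and all
names and inputs are choices made for the example.

```python
import random

def local_moran(values, weights):
    """Local Moran's I_i for every unit (no rescaling to [-1, 1])."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    return [z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n))
            for i in range(n)]

def pseudo_p(values, weights, i, n_perm=999, seed=0):
    """Pseudo p-value for unit i from conditional random permutations."""
    rng = random.Random(seed)
    observed = abs(local_moran(values, weights)[i])
    others = values[:i] + values[i + 1:]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(others)
        shuffled = others[:i] + [values[i]] + others[i:]
        if abs(local_moran(shuffled, weights)[i]) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

# Hypothetical call volumes for four cells arranged in a chain.
chain = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
volumes = [1.0, 2.0, 3.0, 4.0]
lisa = local_moran(volumes, chain)
p_first = pseudo_p(volumes, chain, 0, n_perm=99)
```

A cell would be reported as significant at the 0.05 level when its pseudo
p-value does not exceed 0.05; more permutations give a finer resolution of
that value.
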
Figure 9. Time-series plot of phone-call net flow among all mobile cells (each line
represents a different mobile cell).

As shown in Figure 12, we generated both a Moran scatter plot and the spatial
distribution of the labelled mobile cells with IDs to visually and interactively
identify the cells with statistically significant local associations (at the
0.05 significance level).


Figure 10. Visualizing the dynamic phone-call “source” areas (red), “sink” areas (blue)
and “zero-balance” areas (yellow) in 3D-GIS environment.


By comparing the phone call volume of each cell with that of its 1st-order
adjacent neighbors, four types of associations are identified: (1) HH:
the call volumes in both the target cell and its neighbors are high; (2) HL:
high call volume in the target cell with low volume in neighbors; (3) LH: low
call volume in the target cell with high volume in neighbors; (4) LL: the call
volumes in both the target cell and its neighbors are low. Figure 12a
illustrates the results of significant Moran’s I LISA in the period 3AM~6AM.
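
The four association types correspond to the quadrants of a Moran scatter
plot and can be assigned as in the sketch below. Using the global mean as the
high/low threshold and the average of the neighbors' values as the spatial
lag are assumptions of this example, as are the function name and the sample
values; every cell is assumed to have at least one neighbor.

```python
def quadrant(values, weights, i):
    """Return 'HH', 'HL', 'LH' or 'LL' for unit i."""
    n = len(values)
    mean = sum(values) / n
    neighbors = [values[j] for j in range(n) if weights[i][j]]
    lag = sum(neighbors) / len(neighbors)  # average value of the neighbors
    own = "H" if values[i] > mean else "L"
    around = "H" if lag > mean else "L"
    return own + around

# Hypothetical call volumes for four cells arranged in a chain.
chain = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
volumes = [10, 1, 1, 9]
types = [quadrant(volumes, chain, i) for i in range(len(volumes))]
```
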
Figure 11. Local Moran’s indicators of spatial autocorrelation (LISA) analysis on
phone calls in different time periods of a day (using the 1st-order-neighbor spatial
weighted matrix).


There are continuous significant LL spatial associations in the urban
central region and an HH association structure in the southwest and northeast
suburban regions, which contain large residential housing areas according to
the Google Earth imagery¹ of this city. However, this spatial autocorrelation
structure changes over time; for instance, the central area tends to have a
mixture of all HH, HL, LH, and LL association types in the period 9AM~12PM
(see Figure 12b).

The spatial weights matrix plays an important role in spatial autocorrelation
analysis (Getis and Aldstadt, 2010). As shown in Figure 13, a higher order
of spatial adjacency tends to yield a larger number of neighbors.
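
Higher-order neighbor sets can be derived from first-order adjacency by a
breadth-first expansion, as sketched below. Treating the $k$-order neighbors
as all cells reachable within $k$ adjacency steps (cumulative rather than
exactly $k$ steps) is an assumption of this example, as are the function name
and the four-cell chain.

```python
def k_order_neighbors(adjacency, k):
    """Map each cell to the set of cells reachable within k adjacency steps."""
    result = {}
    for start in adjacency:
        reached, frontier = {start}, {start}
        for _ in range(k):
            frontier = {nb for cell in frontier for nb in adjacency[cell]} - reached
            reached |= frontier
        result[start] = reached - {start}
    return result

# Hypothetical first-order adjacency for four cells in a chain.
adjacency = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
second_order = k_order_neighbors(adjacency, 2)
```

In this toy chain, the end cell has one neighbor at order 1, two at order 2,
and three at order 3, which mirrors the growth in neighbor counts with order.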

Another important factor for identifying spatio-temporal autocorrelation
structure is time granularity (e.g., half an hour, one hour, two hours and
others). This inspired us to examine how different combinations of spatial
weights and temporal neighbors affect the STAA results.

Using the methodology introduced in Section 2.3, we implement the
global Moran’s I-like statistic of STAA with different spatial lags and time
lags for hourly phone-call patterns across all cells. Examining the results
reveals two key findings (see Figure 14a). First, the strength of the global
Moran’s I-like spatio-temporal autocorrelation measure ($I_{st}$) for hourly
phone calls is temporally dynamic, and there is a positive-association peak
between 6 AM and 7 AM. Second, the $I_{st}$ measure is more sensitive to the
spatial order than to the temporal neighbors. A higher order of spatial weights
generally results in a higher strength of spatio-temporal autocorrelation.


¹ The imagery with labels is not shown here as required by the data provider.

Figure 12. Moran scatter plot and spatial display of mobile cells with statistically
significant local associations.

Figure 13. The 1st-, 2nd- and 3rd-order spatial adjacency matrices (a dot means the weight
between the two cells is 1; otherwise it is 0) and the corresponding distributions of
neighbors for each mobile cell.


In addition, we implement the global Geary’s C-like STAA measure ($C_{st}$)
and the Getis-Ord’s G-like STAA measure ($G_{st}$) to identify the
spatio-temporal autocorrelation of hourly phone-call patterns in different
hours. Note that the $C_{st}$ statistic indicates a positive autocorrelation
structure when the value lies in (0~1), and a negative autocorrelation when the
value lies in (1~2).
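
The interpretation of values below and above 1 can be checked with the
classic (purely spatial) Geary's C, sketched below. This is the standard
spatial statistic, shown only to illustrate the value ranges; it is not the
space-time extension used in this chapter, and the function name and inputs
are hypothetical.

```python
def gearys_c(values, weights):
    """Global Geary's C for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * (values[i] - values[j]) ** 2
              for i in range(n) for j in range(n))
    den = sum((v - mean) ** 2 for v in values)
    return (n - 1) * num / (2.0 * s0 * den)

# Four cells in a chain; 1 marks a shared boundary.
chain = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
c_trend = gearys_c([1, 2, 3, 4], chain)            # below 1: positive
c_alternating = gearys_c([1, -1, 1, -1], chain)    # above 1: negative
```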

It is found that the hourly autocorrelation trends of the $C_{st}$ measure are
similar to those of the $I_{st}$ measure (Figure 14b). However, the $G_{st}$
measure did not reveal the temporal dynamics of autocorrelation strength in our
datasets. The $G_{st}$ statistic is also sensitive to the spatial order of the
weights matrix (Figure 14c).
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Figure 14. Three global measures of spatio-temporal association with different combi- 
nations of spatial weights (spatial orders) and temporal neighbors (1 time-lag: 1 hour; 2 
time-lag: 2 hours; 3 time-lag: 3 hours) for hourly phone-call patterns: (a) Is, (b) Cs; 
and (c) Gx measures, 


CONCLUSION AND FUTURE WORK 


In this paper, we introduce a spatio-temporal analytical framework for
exploring human mobility patterns and urban dynamics with the help of GIS.
The integration of spatio-temporal visualization, space-time density
estimation and spatio-temporal autocorrelation analysis not only helps to
represent spatio-temporal data visually and interactively but also offers
quantitative analytics for identifying spatio-temporal patterns (such as
spatio-temporal hotspots) in the mobile phone data. Our experiments have
demonstrated that the different spatio-temporal techniques have both
advantages and limitations. For instance, the space-time path model is good
for visual exploration or an overview of individual regular (or irregular)
movement patterns but might not be suitable for massive trajectories because
of overlapping and cluttering problems in the space-time cube. The study
also demonstrates that a user’s “home” and “working places” can be inferred
from statistics on the visited mobile cells together with the personal
temporal signatures at these places. In addition, we implement the STKDE to
analyze the continuous spatio-temporal density structure of the
georeferenced phone-call activities; this technique facilitates the
identification of spatio-temporal patterns. Furthermore, to understand the
spatial interaction structure of phone calls, we introduce a novel 3D flow
visualization approach that generates vertical Bézier curves in the 3D-GIS
environment and also supports interactive analysis of spatial information
and thematic attributes.

Moreover, we have investigated different statistical measures (I_st, C_st,
and G_st) which extend the classic spatial association indices to
spatio-temporal autocorrelation analysis.

The spatial order of the weight matrix was found to have a more significant
effect than the temporal neighbors on the autocorrelation strength of hourly
phone-call volume across the whole study area.

The spatio-temporal analytical framework introduced in this paper can also
be applied to other spatio-temporal datasets (e.g., infectious diseases,
crimes, and GPS tracks) to facilitate knowledge discovery and decision
support in urban informatics and the social sciences.
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ABSTRACT 


Crop protection is, besides agricultural engineering, plant breeding
and fertilization, an indispensable part of modern agriculture. Still,
even though it cannot be substituted, there is a downside. The use of
pesticides, and therefore the application of chemicals into the
environment, bears substantial risks for both nature and human beings.
Geographical Information Systems (GIS) can help make crop protection
more sustainable. This chapter describes two examples of how GIS is
used by the Central Institute for Decision Support Systems (DSS) in Crop
Protection (German acronym ZEPP) to support farmers in Germany with
their pesticide applications.

The first example describes a DSS that creates machine-readable
application maps using a web-based GIS application. Application maps
offer the possibility to automate the pesticide spraying process. The maps
created by the DSS include legal buffer zones to water bodies and
protected terrestrial structures, e.g. hedges, where spraying of
pesticides is prohibited. Provided that a tractor with a Global
Navigation Satellite System (GNSS) receiver and a pesticide sprayer with
section control are available, an automated application is possible. Once
the sprayer moves into an area of the field that is a buffer zone, the
respective section is switched off automatically. The DSS helps farmers to
comply with legal rules and to prevent the contamination of the
environment.

The second example describes how GIS is used in pest forecast
systems. With the help of GIS it is possible to obtain results with higher
accuracy for disease and pest simulation models. Pest forecast systems
developed by ZEPP use GIS to interpolate temperature and relative
humidity as a function of geographical factors, thus obtaining
meteorological data for every km² in Germany. The interpolated data and
precipitation, which is taken from radar measurements, are used as input
for the simulation models. The output of these models is presented as
spatial risk maps in which areas of maximum risk of disease outbreak,
infection pressure or pest appearance are displayed. The modern
presentation methods of GIS make the results easy to interpret and
furthermore promote the use of the system by farmers.


INTRODUCTION 


Agriculture is the targeted production of plant or animal products. It
serves primarily for food production and is therefore of the highest
importance to human society. Agriculture has nowadays reached a very high
level of technology and efficiency. This is necessary because, in a world
with an ever-growing number of inhabitants and an equally growing demand for
both quantity and quality of agricultural products, great emphasis has to be
put on getting the most out of what is available. This is especially
important since the amount of arable land stays the same or is even
declining because of climate change and other issues (Oerke et al. 1999).

Unfortunately but inevitably, the development of useful plants and their
intensive cultivation have triggered an evolution of harmful organisms that
cause plant diseases and pest attacks.

Several times, such pathogens have had a deep impact on human
development. One example is the outbreak of late blight in Ireland between
1845 and 1849. Four years of poor or failed potato harvests caused the death
of 500,000 Irish people due to famine and triggered the emigration of
1.6 million more to the US.

Crop protection tries to fight such pest attacks to ensure that yields stay
at the necessary level. Besides agricultural engineering, plant breeding and
fertilization, it has become an indispensable part of modern agriculture.

Experts estimate that without any type of intelligent crop protection,
agricultural yields would on average be around 50% lower. The fact that we
have not seen a hunger crisis in the western world for many years is due to
a highly efficient agriculture that includes crop protection measures.

Still, even though it cannot be substituted, there is a downside to crop
protection. The use of pesticides, and therefore the application of
chemicals into the environment, bears substantial risks for both nature and
human beings.

Spraying helps to avoid plant diseases and pest attacks, but the number of
treatments is often not optimally adjusted to the appearance of noxious
organisms. Achieving optimal efficiency is difficult. This often leads to
negative economic impacts for farmers and negative impacts on the
environment.

To minimize the negative effects of crop protection, several preconditions
have to be met. On the one hand, there have to be strong regulations in
place on how, when and where to use pesticides, and on how, when and where
not to use them. On the other hand, Decision Support Systems (DSS) have to
be set in place to allow smarter decisions, e.g. on when and where to use
pesticides.

In Germany, forecasting models and advice by the government are used in the
planning of spraying, taking economic and ecological aspects into account.
Predicting the occurrence of plant diseases and pest attacks and preventing
them are important components of integrated pest management.

Since location plays a crucial role in agriculture, GIS can play an
important role here. It can support sustainable crop protection by helping
to get better answers to questions like:


• Where should which amount of pesticide be applied to achieve
optimal results and to comply with legal regulations?

• When is the optimal date for certain crop protection measures in a
specific location?


The mission of ZEPP is to develop, collect and examine existing
forecasting and simulation models for important agricultural and
horticultural pests and diseases and to adapt these models for practical
use.

GIS is an integral part of these models. The following sections present two
examples of how GIS is used in DSS.


1. CREATING MACHINE READABLE 
APPLICATION MAPS FOR CROP PROTECTION 


Expectations regarding the use of pesticides in Germany are high, and
meeting them is necessary to sustain public acceptance of modern
agriculture. Farmers need efficient data management, because complying with
the rules and requirements for planning, applying and documenting pesticide
measures generates a large volume of information.

To prevent the deposition of pesticides into water bodies as well as other
damage to the environment, several laws on legal buffer zones to rivers,
etc. apply. The instructions that come with each pesticide explain the
product-specific buffer zones. Certain pesticides require, for example, a
distance of 20 meters to water bodies if the field slopes more than 2%.
Legal buffer zones also depend on the application technique; the decisive
factor here is the drift reduction class of the spray nozzles.
Additionally, each German state has specific laws that mandate buffer zones
to water bodies, and if a district does not have an adequate share of small
landscape features (e.g. hedges), buffer zones to such structures apply,
too.

Because of these factors and the necessary documentation, proper crop
protection is challenging for farmers who want to achieve an optimal and
correct implementation. For agricultural service providers, whose machines,
drivers and work regions change frequently, these challenges are even bigger
(Scheiber and Kleinhenz 2013b). In day-to-day agricultural practice, the
planning and implementation of crop protection measures, as well as
compliance with laws and rules and any sort of documentation, are mostly the
responsibility of the operator who carries out the work. Much of this work
is still done manually and without the support of information technology,
which results in high workloads and makes errors more likely.

GIS can help to automate and therefore optimize the processes mentioned
above.

One option is to automate the spraying process and the protection of
adjacent natural and aquatic ecosystems by using GIS-created application
maps that include legal buffer zones where spraying is prohibited. These
maps can then be transferred to sprayer terminals.

ZEPP and ISIP have, in cooperation with different project partners¹,
developed an internet-based DSS that creates such application maps (Scheiber
et al. 2012, Scheiber and Kleinhenz 2013a-d). Figure 1 describes the
underlying concept.

The DSS consists of six steps, in which data from the farmer as well as
public information and geodata are integrated (Figure 2). Each step is
described in detail below.


Step 1: GNSS Survey of Field Geometries, Water Bodies and
Terrestrial Structures


The process starts with the mapping of field geometries and sensitive
landscape areas adjacent to the field. Geodata about water bodies and
terrestrial structures like hedges or forest edges are necessary to be able
to calculate the required buffer zones.

A technical procedure for conducting such surveys has been developed. A
GNSS-RTK based approach is promoted, which reaches an accuracy of up to
just a few centimeters. The procedure allows the farmer to map the
applicable landscape elements during a tractor ride using an offset method.
It has been developed in cooperation with German supervising authorities so
that data recorded this way are officially accepted.


Figure 1. Concept of creating application maps (figure labels: Data
Integration, Application Map).


¹ http://www.igreen-projekt.de.
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Figure 2. Decision Support System. 


Step 2: Data Input Via Farm Management Information System 
(FMIS) or Web Interface 


To allow field-specific advice, data from the farmer are necessary. These
include information about the cultivated crop, the geographic coordinates of
the field (e.g. from Step 1) and the spray nozzle used (drift reduction
class). Data input can be done either via a direct connection to an FMIS
(e.g. Landdata Eurosoft or HELM Software) or via a web interface on
www.isip.de. Figure 3 shows an example of a web interface.


Step 3: Calculation of Buffer-Zones Based on Legal Regulations 


In the third step, zones inside the field are identified in which pesticide
application is not allowed under the conditions of the specific pesticide
application. The output is a machine-readable application map. The following
factors are included in the process:


• Pesticide-specific buffer zones to water bodies or other landscape
structures deserving protection, based on information from the pesticide
database of the German Federal Office of Consumer Protection and
Food Safety (Bundesamt für Verbraucherschutz und
Lebensmittelsicherheit 2013)

• Buffer zones that arise from the slope of a field (e.g. >2%)

• Buffer zones that arise from the spray nozzles used (drift reduction
class)

• Buffer zones to water bodies depending on which German state the
field is in

• Buffer zones to terrestrial structures deserving protection, based on
the index of small landscape features by the Julius Kühn-Institut
(JKI), Federal Research Centre for Cultivated Plants
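
How several such rules might combine into one required distance can be
sketched as follows. This is a minimal illustration under the assumption that
the strictest applicable rule wins; the function name, the lookup table and
every distance except the 20 m at more than 2% slope example from the text
are hypothetical placeholders, not legal values, and the rule for small
landscape features is left out for brevity.

```python
def required_buffer_m(product_buffer_by_drift_class, drift_class,
                      slope_percent, state_minimum_m=0.0,
                      slope_limit_percent=2.0, slope_buffer_m=20.0):
    """Return the widest buffer (in metres) demanded by any applicable rule.

    product_buffer_by_drift_class maps a nozzle drift reduction class (in %)
    to the product-specific distance; all values here are placeholders.
    """
    candidates = [product_buffer_by_drift_class[drift_class], state_minimum_m]
    if slope_percent > slope_limit_percent:
        candidates.append(slope_buffer_m)
    return max(candidates)


# Hypothetical product: less drift reduction means a wider buffer.
example_product = {0: 20.0, 50: 10.0, 75: 5.0, 90: 5.0}

flat_field = required_buffer_m(example_product, drift_class=75,
                               slope_percent=1.0, state_minimum_m=5.0)
sloping_field = required_buffer_m(example_product, drift_class=75,
                                  slope_percent=3.5, state_minimum_m=5.0)
print(flat_field, sloping_field)
```

In a GIS, taking the maximum distance per protected feature corresponds to
the union of all individual buffer polygons around that feature.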


The calculation of the buffer zones is carried out by an online GIS
application. Within a complex geoprocessing service, information and geodata
from the sources mentioned above are intersected to identify zones in the
field where pesticide spraying is prohibited. The result is a map that
defines application zones and legal buffer zones (Figure 1). The farmer has
the possibility to edit this map.

Figure 3. Example of a web interface.
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Step 4: Creation of the Machine-Readable Application Map 


The application map is provided in the non-proprietary ISO-XML
format (ISO 11783-10 2009), which can be used on terminals of different
manufacturers. The ISO-XML file format is becoming more and more
established in agricultural engineering.

Figure 4 shows examples of crop protection tasks on terminals of John
Deere and the Competence Center ISOBUS e.V. (CCI).


Figure 4. Application Tasks on a Terminal of John Deere and CCI. 
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Step 5: Application 


Provided that a tractor with GNSS and a pesticide sprayer with section
control are available, an automated application is possible. Once the
sprayer moves into an area of the field that is a buffer zone, the
respective section is switched off automatically (Figure 5).
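
The switching logic can be sketched as a simple distance test. The example
below is only an illustration, not the terminal's actual implementation: it
assumes projected coordinates in metres, a water body given as a polyline,
one reference point per boom section and a 20 m buffer; all names and
coordinates are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polyline(p, line):
    """Shortest distance from point p to a polyline (list of vertices)."""
    return min(distance_to_segment(p, a, b) for a, b in zip(line, line[1:]))


def section_states(section_positions, water_body, buffer_m):
    """True = section may spray, False = section is inside the buffer."""
    return [distance_to_polyline(p, water_body) > buffer_m
            for p in section_positions]


stream = [(0.0, 0.0), (100.0, 0.0)]                      # hypothetical stream
boom = [(50.0, 8.0), (50.0, 14.0), (50.0, 20.5), (50.0, 27.0)]
print(section_states(boom, stream, buffer_m=20.0))
# [False, False, True, True]
```

A real system would evaluate this test continuously against the buffer
polygons stored in the application map as the GNSS position changes.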


Step 6: Documentation 


Modern terminals are able to record data about pesticide applications. This
considerably facilitates the documentation process. The protocol file can be
used as evidence for public authorities or purchasers, proving compliance
with legal buffer zones. Furthermore, the information generated can be used
for subsequent treatments.

In summary, using the DSS brings several benefits for farmers:


• Observance of legal buffer zones

• Facilitation of a proper pesticide application

• Cost optimization due to automated section control

• Environmentally sound and sustainable use of pesticides

• Automated documentation


2. USE OF GEOGRAPHIC INFORMATION
SYSTEMS IN PEST FORECAST SYSTEMS²


As described in the introduction, the main mission of ZEPP is to develop,
collect and examine existing forecasting and simulation models for important
agricultural and horticultural pests and diseases and to adapt these models
for practical use. More than 40 weather-based forecasting models for pests
and diseases have been successfully developed in recent years.

The occurrence of diseases/pests and periods of high-intensity attacks can
be calculated with high accuracy. The forecast models are based on different
concepts.


² The information in this subchapter is based on Racca P. et al. 2011.
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Figure 5. Crop Protection Measure. 


These range from simple temperature sum models to complex population
matrices with integrated rate-based algorithms to calculate growth,
reproduction and distribution of noxious organisms.
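
As an illustration of the simplest of these concepts, a temperature sum model
can be sketched as follows. The base temperature, the threshold and the
function name are hypothetical and do not describe any specific ZEPP model.

```python
def first_day_reaching_sum(daily_mean_temps, base_temp, threshold):
    """Return the 1-based day on which the temperature sum reaches threshold.

    Only the part of each daily mean above base_temp is accumulated.
    Returns None if the threshold is never reached.
    """
    total = 0.0
    for day, temp in enumerate(daily_mean_temps, start=1):
        total += max(0.0, temp - base_temp)
        if total >= threshold:
            return day
    return None


# Hypothetical daily mean temperatures in °C
temps = [4.0, 8.0, 12.0, 10.0, 15.0]
print(first_day_reaching_sum(temps, base_temp=5.0, threshold=15.0))  # 4
```

With spatially interpolated temperatures, the same calculation could be
repeated for every grid cell instead of once per weather station.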

DSS are employed for the


• estimation of disease/pest risk

• estimation of the necessity for pesticide treatments

• forecast of the optimal timing for field assessments

• forecast of the optimal timing for pesticide treatments

• recommendation of appropriate pesticides


Results of DSS are distributed to the farmers via warning services, using 
different transmission media (bulletins, letters, faxes and telephone answering 
machines) and via the internet platform www.isip.de (Information System for 
Integrated Plant Production) (Röhrig and Sander, 2004). The predictions are 
suitable for integrated as well as organic farming. 

Meteorological data as well as assessed field data are needed as input for 
DSSs. With these input data the decision support systems calculate an output 
result, e.g. the date of the first appearance of a pest. 

Meteorological data in Germany are provided by the German Meteorological
Service; in addition, some federal states in Germany have built up their own
meteorological networks.

At the moment, data from 148 stations of the German Meteorological Service
and 417 stations owned by the Governmental Crop Protection Services
(GCPS) of the federal states are available. In total, data from 565 stations
can be used to run decision support systems.

However, in some agricultural areas, the distance between meteorological
stations (MS) exceeds 60 km. Forecast models did not give satisfactory
results for fields that far away from an MS (Zeuner, 2007). With the help of
Geographic Information Systems (GIS), a plot-specific classification of
temperature and relative humidity has been developed using the complex
statistical interpolation methods described by Zeuner (2007). The method,
however, cannot be applied to precipitation.

Especially in the case of frequent rainfall that is limited in space and
time (so-called convective rainfall events), the interpolation of
precipitation does not give plausible results (Zeuner and Kleinhenz, 2007,
2008, 2009).

To overcome this restriction, precipitation data is not interpolated but 
obtained from radar measurements with a high spatial resolution. 

Using these spatial input parameters for the currently available disease
forecast models leads to accurate forecasting for areas in-between two or
more distant MSs. With the use of GIS, daily spatial risk maps for diseases
and pests can be created in which the spatial and the temporal process of
first appearance and regional development are documented (Figure 6). These
risk maps lead to improved control and a reduction in pesticide use.
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Figure 6. Scheme of process to calculate risk maps using GIS. 
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2.1. Storage 


In order to store the results of interpolation, a grid was laid out over
Germany. At present, the GCPS use about 565 MSs to represent an
agricultural area of approx. 200,000 km², or an average of one MS per
350 km². With the new GIS method, grid cells have a size of 1 km² and,
after interpolation, are represented by virtual MSs (Liebig and
Mummenthey, 2002).
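
The gain in spatial density can be checked with a few lines of arithmetic.
The sketch assumes that virtual MSs are counted only for the roughly
200,000 km² of agricultural area quoted above; the variable names are
illustrative.

```python
AGRICULTURAL_AREA_KM2 = 200_000   # approximate area quoted in the text
STATIONS = 565                    # real meteorological stations
CELL_AREA_KM2 = 1                 # size of one grid cell

area_per_station_km2 = AGRICULTURAL_AREA_KM2 / STATIONS
virtual_stations = AGRICULTURAL_AREA_KM2 // CELL_AREA_KM2
densification = virtual_stations / STATIONS

print(round(area_per_station_km2))  # 354, i.e. roughly one MS per 350 km²
print(virtual_stations)             # 200000 virtual MSs, one per grid cell
print(round(densification))         # 354 times as many data points
```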


2.2. Spatial Data of Temperature and Relative Humidity 


For the interpolation of temperature and relative humidity, the multiple
regression (MR) method was chosen because it gave the best results with the
shortest calculation time of all tested interpolation methods. The first
calculations with the four interpolation methods (Inverse Distance Weighted,
Spline, Kriging and Multiple Regression) showed that deterministic
interpolation methods were not suitable. The general purpose of multiple
regression (the term was first used by Pearson, 1908) is to learn more about
the relationship between several independent or predictor variables and a
dependent or criterion variable. As an interpolation method, MR allows
simultaneous testing and modelling of multiple independent variables
(Cohen et al., 2003).

Parameters that have an influence on temperature and relative humidity,
e.g. elevation, slope and aspect, can therefore be tested simultaneously. MR
uses matrix multiplication, and only variables with a defined minimum
influence are included in the model. The result of MR is a formula
(x = const + A1*const1 + A2*const2 + A3*const3 + ... + Ax*constx) which
allows a parameter set to be calculated for each grid cell for which the
independent variables are known (Zeuner, 2007).
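
The principle can be illustrated with a small least-squares fit. The sketch
below is not the procedure of Zeuner (2007): it assumes only two predictors
(elevation and slope), uses synthetic station values constructed for the
example, and solves the normal equations directly; all names are
hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def fit_multiple_regression(predictors, target):
    """Least-squares fit of target = const + A1*x1 + A2*x2 + ..."""
    rows = [[1.0] + list(p) for p in predictors]  # leading 1 = constant term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, predictor):
    """Evaluate the fitted formula for one grid cell."""
    return coefficients[0] + sum(a * x for a, x in zip(coefficients[1:], predictor))


# Synthetic stations: (elevation in m, slope in %) and temperature in °C,
# generated from 15 - 0.0065 * elevation + 0.1 * slope.
station_predictors = [(100, 2), (300, 5), (500, 1), (800, 8), (200, 0)]
station_temps = [14.55, 13.55, 11.85, 10.6, 13.7]

coeffs = fit_multiple_regression(station_predictors, station_temps)
virtual_station_temp = predict(coeffs, (400, 3))  # a grid cell without an MS
print([round(c, 4) for c in coeffs], round(virtual_station_temp, 2))
```

Once the coefficients are known, the formula can be evaluated for every
1 km² grid cell whose elevation and slope are available from terrain data.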

GIS Applications in Modern Crop Protection 109

To validate the results of the interpolation, 13 MSs were left out of the
interpolation process. After interpolation, the calculated values were
compared with the data measured at these stations. The study was conducted
from January to August in the years 2003 to 2006. For all stations, MR gave
the results with the highest accuracy (Table 1). In all cases, the
coefficient of determination (CoD) ranged between 96 and 99% for temperature
and between 92 and 96% for relative humidity. For the 13 MSs, the mean
deviation calculated with MR did not exceed 0.1°C for temperature and 0.6%
for relative humidity in absolute value. The largest absolute deviation was
4.7°C for temperature and 32.6% for relative humidity. The differences
between calculated and measured data were also tested for significance using
a t-test. The test indicated that for all stations the differences between
the calculated and measured values were random. The MR method gave plausible
results, so it was chosen to interpolate the meteorological data to be used
as input for the forecasting models.
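
The statistics reported in Table 1 can be reproduced in principle with a few
lines of code. The sketch below assumes that the CoD is the squared Pearson
correlation between calculated and measured values, which the text does not
state explicitly; the sample values and names are hypothetical.

```python
def validation_summary(calculated, measured):
    """Mean, maximum and minimum deviation plus coefficient of determination.

    Deviation is calculated minus measured; the CoD is taken here as the
    squared Pearson correlation (an assumption for this sketch).
    """
    n = len(calculated)
    deviations = [c - m for c, m in zip(calculated, measured)]
    mean_c = sum(calculated) / n
    mean_m = sum(measured) / n
    cov = sum((c - mean_c) * (m - mean_m) for c, m in zip(calculated, measured))
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return {
        "mean_dev": sum(deviations) / n,
        "max_dev": max(deviations),
        "min_dev": min(deviations),
        "cod": cov * cov / (var_c * var_m),
    }


# Hypothetical hourly temperatures in °C at one withheld station
calculated = [10.0, 12.0, 14.0, 16.0]
measured = [10.5, 11.5, 14.0, 16.0]
summary = validation_summary(calculated, measured)
print(summary)
```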


2.3. Spatial Precipitation Data 


The German Meteorological Service runs 16 radar stations to record
precipitation all over Germany. These stations do not measure the amount of
precipitation at ground level but the signal reflected from the rain drops
in the atmosphere. At first, these measurements only allowed the calculation
of an unspecific ‘precipitation intensity’, which was a shortcoming. With
the RADOLAN system, the intensity is now calibrated online with data from a
comprehensive network of ombrometers (rain gauges), using complex
mathematical algorithms. As a result, the amount of precipitation can be
provided at a spatial resolution of 1 km² (Bartels, 2006). These calibrated
amounts of precipitation based on radar-measured rainfall intensities are
referred to as “radar data” in the following. The precipitation data were
validated in intensively used agricultural areas by joining the radar grid
with stations of the meteorological network. In this way, it was possible to
relate each station to a grid cell.
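
Relating a station to its grid cell amounts to a simple index calculation
once both are in the same projected coordinate system. The sketch below
assumes a regular 1 km grid with a known origin and coordinates in metres; it
does not reproduce the actual RADOLAN grid definition, and all names and
values are hypothetical.

```python
import math

CELL_SIZE_M = 1000.0  # 1 km grid cells


def grid_cell(x_m, y_m, origin_x_m=0.0, origin_y_m=0.0):
    """Return (row, col) of the grid cell that contains the point (x, y)."""
    col = math.floor((x_m - origin_x_m) / CELL_SIZE_M)
    row = math.floor((y_m - origin_y_m) / CELL_SIZE_M)
    return row, col


def pair_stations_with_radar(stations, radar_grid):
    """Look up the radar value of each station's grid cell.

    stations: {name: (x_m, y_m)}, radar_grid: {(row, col): precipitation_mm}.
    Stations outside the grid are paired with None.
    """
    return {name: radar_grid.get(grid_cell(x, y))
            for name, (x, y) in stations.items()}


stations = {"A": (1250.0, 3999.0), "B": (-10.0, 20.0)}
radar_grid = {(3, 1): 0.4}  # hourly precipitation in mm for one cell
print(pair_stations_with_radar(stations, radar_grid))
# {'A': 0.4, 'B': None}
```

The paired values per hour then form the two series that are compared in the
statistical verification.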

The radar-derived precipitation at the station’s grid cell and the actually
measured data formed the basis for the statistical verification. Since rain
events differ throughout the year, two representative months (May and August
2007) were selected to analyse uniform rainfalls in spring as well as
convective rainfall events in summer. This resulted in a validation dataset
of 1488 hours for each MS (62 days × 24 hours).


Table 1. Validation of data on temperature and relative humidity;
deviation between calculated values and measured data with MR


           | temperature [°C]          | relative humidity [%]
           | 2003 | 2004 | 2005 | 2006 | 2003  | 2004  | 2005  | 2006
CoD        | 96%  | 96%  | 99%  | 98%  | 94%   | 96%   | 95%   | 92%
mean dev.  | 0.0  | 0.0  | 0.0  | 0.1  | 0.3   | 0.1   | 0.1   | -0.6
maximum    | 4.4  | 4.1  | 4.3  | 4.7  | 19.6  | 32.6  | 21.6  | 21.2
minimum    | -3.8 | -4.5 | -4.5 | -4.1 | -18.9 | -21.9 | -22.8 | -22.8
t-test     | n.s. | n.s. | n.s. | n.s. | n.s.  | n.s.  | n.s.  | n.s.


n = 92160 hours, n.s. = not significant.
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Depending on the region, the number of MSs ranged from 9 to 29. In 
addition, the influence of the distance between radar station and MSs was 
analysed. 

Furthermore, a leaf wetness simulation model used by ZEPP (Racca, 
2001, unpublished) was run on data from both methods of precipitation 
measurement and the results were compared. 

The amount of precipitation, the number of hours with precipitation and the
calculated leaf wetness showed high correlations between radar values and
measured data. The maximum of the hourly deviation of the amount of
precipitation was 0.06 mm. In hours with rainfall the deviation was slightly
higher (0.36 mm). No correlation with the distance between radar stations
and MSs could be detected.

For hourly rainfall patterns, a correlation of 91.4% between stations and
validation areas was measured. The best correlations were obtained for the
leaf wetness model, for which values > 99.9% were achieved.

The results clearly show that the use of radar data as an input parameter in 
disease forecast models is valid. By adding data of temperature and relative 
humidity with high spatial resolution, an optimal basis for plot-specific 
forecasts has been established. 

Moreover, this system allows the exact detection of local convective 
rainfall events, which at the moment often remain undetected using individual 
MSs. Significant improvements of the spatial forecasting by plant disease 
simulation models can be expected from the use of radar data. 


2.4. Introducing Spatial Risk Maps into Practice (www.isip.de) 


ISIP, the Information System for Integrated Plant Production
(www.isip.de), is a Germany-wide online decision support system. It was
initiated in 2001 by the German Crop Protection Services as a common portal,
thus achieving synergies by pooling existing information.

Target groups are farmers as well as advisors.

Since information transfer is the primary task of extension services, the
system is intended to make this work more efficient by using modern
information technology. Therefore, a bi-directional data flow between the
services and the farmers was developed. By combining general with specific
data, recommendations can be refined from regional to individual.

The information is primarily distributed via HTML pages, thus a browser
is necessary to use the system (Röhrig and Sander, 2004).

In 2010, a new way of presenting the results of prognosis models for plant
pests and diseases was implemented. Using interpolated meteorological data
with a high spatial resolution as input parameters, so-called ‘risk maps’
are drawn (Figure 7). These maps have several advantages compared to results
representing point information based on single MSs:


• Risk maps are more suitable for identifying hot spots and ease the
interpretation of the model’s results.

• The user does not have to choose a specific MS, which might not even
be representative of his or her plant production site.

• The maps conform to the OGC standards and can thus be used in other
systems.


In addition to the GIS functionalities of zooming and panning, it is 
possible to scroll through the maps of the last ten days. This gives an excellent 
overview of the temporal development of the pest or disease risk. 

The system is supplemented by a spatial three-day weather forecast 
offered by the German Meteorological Service. 
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Figure 7. Risk map of the German Federal State of Lower Saxony for Potato Late 
blight in September 2013. Shown are the infection pressures in five classes (very low 
[Sehr niedrig] to very high [Sehr hoch]) and the respective spraying intervals in days. 
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It is expected that this further supports the decision and management 
processes of the farmer. 

In summary, pest forecast systems need plausible and complete
meteorological data as their main input. In Germany, meteorological data are
mainly provided by the German Meteorological Service. Additionally, several
German states have built up their own meteorological networks.

However, with data from MSs alone, a good prognosis is only achieved in the
vicinity of an MS. That is why ZEPP developed a new GIS-based technology.
With the help of GIS it is possible to obtain results with higher accuracy
for disease and pest simulation models.

Temperature and relative humidity were interpolated with GIS methods that
take the influence of geographical factors into account, yielding
meteorological data for every km² in Germany. Precipitation was taken from
radar measurements, and all of these meteorological data were used as input
for the simulation models.

The output of these models is presented as spatial risk maps in which
areas of maximum risk of disease outbreak, infection pressure or pest
appearance are displayed.

The modern presentation methods of GIS make the results easy to interpret
and will further promote the use of the system by farmers.


CONCLUSION 


The two examples described above show that the use of GIS can support
crop protection measures in different ways and make them more sustainable.

GIS helps in planning pesticide applications by answering the question of
whether spraying at a specific location is warranted at all, i.e. whether
the weather conditions have been favorable for a certain plant disease or
pest. Thus the pesticide application can be optimized in both economic and
ecological terms. GIS also helps in the actual spraying process by ensuring
that zones that deserve protection, like water bodies, are automatically
taken into account. GIS-created application maps automate the spraying
process.

When a protected zone is entered, the sprayer automatically stops the
pesticide application, so that water bodies and hedges are not contaminated
with chemicals.
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Chapter 5 


GIS APPLICATIONS IN PRACTICE: 
EXPLORING SPATIAL DYNAMIC 
OF TRANSPORT ACTIVITIES 


Tiebei Li (Terry)* 
Urban Research Program, Griffith University, Australia 


ABSTRACT 


The spatial pattern of urban transport activities has become a focus of
recent academic enquiry and planning policy concerns. This is largely
driven by rapid urban growth and increased transport pressure in
major international cities and by the demand for improved transport
infrastructure and services. This article focuses on the application of
Geographical Information Systems (GIS) techniques to explore
geographic patterns of major urban transport activities at both urban and
regional scales. The first part of the article develops GIS methods to
analyse the geographical pattern of commuting transport at a regional
level. The methodology uses multiple spatial O-D transport data at
regional geographical units and applies disaggregated spatial techniques
to identify spatial patterns of commuting distance and traffic flow and
the changes in these patterns over time. The second part of the article
demonstrates the application of GIS techniques in exploring
tempo-spatial patterns of public bicycle trips in urban areas. A GIS
technique called the flow map is developed to explore tempo-spatial
patterns of public bicycle use under different calendar events and
climatic conditions. The paper demonstrates how the results from the GIS
techniques may form part of an evidence base for the transport planner,
with the potential to inform future development and enhance efficiency,
and how the application of GIS techniques will enhance the planner’s
toolbox in responding to transport planning issues.


1. INTRODUCTION: 
TRANSPORT ANALYSIS AT DIFFERENT SCALES 


In transport studies, transport-related activities and processes can be 
analysed across a whole range of scales. This may include regional, local and 
individual levels. The scale of transport study and analysis is closely related to 
the transport phenomenon under investigation and the questions being posed 
about it. In general, the transport structure can be broken down into regional, 
local and micro levels. 


Regional Level 


The system-wide transport analysis can be applied to the regional or 
national level. At this level, regional researchers and strategic planners account 
for the general transport interactions of people in large regions across space, 
for example, national migration, freight movement, and inter-state flow and 
interactions. Regional transport analysis is fundamental in many strategic 
transport planning and policy issues (Miller, 1998). The major outputs of 
analysis at this level include the regional economic interactions and population 
migration, the transport infrastructure, the processes between industries and 
interregional trade, and transport systems. 

Since the 1950s and 1960s, various geographers and economists have 
developed a suite of analytical approaches. They are typically used to model 
structural relationships and spatial interactions between the regional 
economies as well as regional mobility of population. Some research methods
are based on macro-economic theory. Examples are regional economic-base
analysis (Tiebout, 1962) and multi-regional input-output analysis (Leontief,
1986). These are able to resolve a degree of regional interactions and
relationships between the regional economies. The gravity-based model (Lowry,
1964) is one of the most widely used methods at a regional level for the
analysis of regional economic interactions and transport. Gravity-based
models explicitly model and predict the spatial interactions in a region's
(or a nation’s) economy. Regional planning uses spatial interaction analysis
to forecast trends in migration, employment and capital flows. The spatial
division used in regional spatial interaction analysis is typically based on
economic regions, so as to reflect regional economic and transport
performance.
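
The idea behind a gravity-based model can be illustrated with a minimal
sketch. It assumes the generic unconstrained form T_ij = k * O_i * D_j /
d_ij^beta, which is a textbook simplification and not the specific
formulation of Lowry (1964). The function name, the parameters k and beta,
and all input values are hypothetical and chosen only for illustration.

```python
def gravity_flows(origins, destinations, distance, k=1.0, beta=2.0):
    """Unconstrained gravity model: T[i][j] = k * O_i * D_j / d_ij ** beta.

    origins[i]      -- size of origin i (e.g. resident workers), assumed input
    destinations[j] -- size of destination j (e.g. jobs), assumed input
    distance[i][j]  -- separation between i and j (must be > 0 to yield a flow)
    """
    flows = []
    for i, o_size in enumerate(origins):
        row = []
        for j, d_size in enumerate(destinations):
            dist = distance[i][j]
            # Zero or negative distances are treated as "no flow" in this sketch.
            row.append(k * o_size * d_size / dist ** beta if dist > 0 else 0.0)
        flows.append(row)
    return flows


# Hypothetical two-region example (values are illustrative only).
workers = [1000, 500]
jobs = [800, 700]
dist_km = [[5.0, 20.0], [20.0, 5.0]]
flows = gravity_flows(workers, jobs, dist_km, k=0.01, beta=2.0)
for row in flows:
    print([round(value, 1) for value in row])
```

In practice k and beta would be calibrated against observed flows; the
sketch only shows how predicted interaction falls as distance grows.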


Metropolitan Level 


Research at the metropolitan-level scale focuses on the transport processes
of smaller spatial units (e.g. traffic zones). At this intermediate level, the
processes focus on the actions of the main elements of the urban transport
system, such as the interactions and behaviours of institutions or the local
residential/economic communities of a region. The problems under
investigation at this level include transport activities to various functional
destinations, such as travel to work, school, shops and recreational
activities. The transport problems typically focus on the geography of
transport zones, inter-zonal transportation, transport patterns and associated
energy consumption (Dodson et al., 2007), and the performance of transport
networks, including road and public transport systems. In addition, congestion
is a common problem at city level around the world, and solutions must be
sought to alleviate it.

Essentially, the transport interactions at metropolitan level have an
inherent spatial component comprising the areas of transport zones (e.g. an
origin zone and a destination zone) and the interactions and movements between
transport zones. Developing an understanding of the basic relationships of the
transport interactions, and their variation according to the type of activity
and transport mode, is central to the search for solutions to the problem.
Geographical Information Systems (GIS) have a potentially large role to play
in the analysis and visualization of transport data, permitting more in-depth
studies to be undertaken. In particular, their ability to integrate multiple
data sources, such as census statistics and road networks with their
associated capacities, permits more advanced measures of movement between
traffic zones to be incorporated.

Theories and research methods have been developed at this level since
about the 1960s. Transport analysis based on micro-economic theory or
location theory (Alonso, 1964) has been widely used to analyse transport
demand and route choice between trip origins and destinations in metropolitan
areas (for example, choosing a path with the lowest travel cost or shortest
distance). GIS analysis at this level typically focuses on inter-zonal
interactions. For example, spatial interaction models specify an overall
governing relationship for flow between locations. The powerful visualization
and data manipulation capabilities of GIS permit transport data to be visually
explored to uncover spatial patterns and trends. GIS techniques that are
capable of tracking changes in the spatial dimension of transport behaviour
can help identify the ever-varying nature of travel demand.


Individual Level 


Aggregate analytical approaches are not recommended for analysing the more
complex transport processes at the very disaggregate level. This is because
the greater diversity of behaviours that exists at the lower levels of
observation needs a variety of social behavioural theories to explain it,
rather than a simple spatial interaction formulation. At the micro-level
scale, research typically focuses on the fundamental units of human behaviour
(an individual, a household). This includes individual social/behavioural
geography, travel patterns, location and route choice decisions, and trip
chain (activity) patterns. Micro-level analysis became a major area of
transport geography because there was a greater variety of human behaviours
and an increased level of spatial variability in urban transport systems.

The micro-level modelling perspective represents activity-based analysis at
the highest possible level of disaggregation. It studies the emergence of
complex patterns from behaviour and interactions at the individual level
(e.g. the location choice of an individual person). Such descriptions of an
individual’s behaviour often draw on microeconomic theory and discrete choice
theory (McFadden, 1978). The actors in micro-level analysis can be an
individual or a household. The research methods for transport activity at the
micro-level are based on the theories and concepts of behavioural geography
(Golledge and Stimson, 1997). For example, micro-simulation or agent-based
modelling is used to simulate transport choices and processes at the level of
the individual actors. The time geography approach (Hagerstrand, 1970)
describes how an individual’s travel behaviour and route choice vary according
to the trip purpose, time, and the location attributes of the neighbourhoods
(Kwan, 1998). In comparison to spatially aggregate analysis, these local
approaches have shown potential in explaining social and economic behaviour
at the individual level. Hence, they can provide a detailed pattern of urban
transport at the most disaggregated level. Nevertheless, micro-level analyses
are not suggested for modelling spatial variations over a large geographical
area, because of the considerable modelling complexity and the demand for
micro-scale data.

The geography of urban transport in essence consists of the study of the
locations of people’s activities coupled with the routes and distances
travelled between those activities. Although GIS has been applied to study
the spatial dynamics of commuting in a variety of instances, discussions on
how to properly use GIS-based methods are relatively sparse in transport
studies. This article focuses on the application of GIS techniques in
exploring geographic patterns of major urban transport activities at various
scales. Although not all GIS techniques presented in this article are new,
they demonstrate how GIS techniques can be tailored for transport studies at
different scales to better address scale-specific transport questions. The
purpose is to contribute to the discussion and theory of linking GIS with a
wide range of transport research and applications. The first part of this
article develops GIS methods to analyse the geographical pattern of commuting
transport at a regional level. The methodology uses multiple spatial O-D
(origin-destination) transport datasets at regional geographical units and
applies disaggregated spatial techniques to identify spatial patterns of
commuting distance and traffic flow and the changes in these patterns over
time. The second part of the article demonstrates the application of GIS
techniques in exploring tempo-spatial patterns of public bicycle trips in
urban areas. A GIS technique called the flow map is developed to explore
tempo-spatial patterns of public bicycle use under different calendar events
and climatic conditions. The paper demonstrates how the results from the GIS
techniques may form part of an evidence base for transport planners, with the
potential to inform future development to enhance uptake of this new urban
transport mode, and how the application of GIS techniques will enhance the
planner’s toolbox whilst responding to transport planning issues.


2. GIS FOR REGIONAL TRANSPORT ANALYSIS 


Investigating transport dynamics for a large region is important given the
increasing challenges facing regional transport. Some key transport issues are
associated with low access to transport, lengthy travel distances and
increased transport energy costs. Such issues have received increasing
attention in transport studies of large regions. For instance, in popular
journey to work studies, developing an understanding of the broad relationship
between workers’ places of residence, their places of work and their
commuting behaviours in a given region, its variation over space and time, and
how this differs according to industry sector and travel mode is central to
the search for solutions to transport problems.


2.1. The Need for GIS 


GIS techniques that are capable of tracking changes in the spatial 
dimension of regional mobility can help identify the ever varying nature of 
transport demand. The geographical dimension of journey to work activities 
has received significant attention over time (see for example Mogridge, 1979; 
O’Connor, 1978, 1980; Gipps et al., 1997; Horner and Murray, 2003; O’Kelly 
et al., 2005; Titheridge and Hall, 2006; Sakanishi, 2006; Sultana and Weber, 
2007; Mees et al., 2008). GIS has been applied to study the spatial dynamics
of commuting in a variety of instances. An early work was provided by
Mogridge (1979), who used Euclidean distance analysis between traffic zones
to investigate journey to work lengths in London. This was followed by Wachs
et al. (1993), who utilized GIS-based techniques to track changes in workers’
home and work locations in California. Other GIS applications include
Christopher et al. (1995), who spatially explored the direction of travel for
9 counties in Chicago. Results indicated minor change in directional biases
over the 20 year study period despite significant regional growth and urban
decentralisation. More recently, GIS operations were used by Vandersmissen et
al. (2003) to analyse changes in worker travel time and distance for Québec
City using disaggregate household travel survey data. Horner (2007)
investigated urban form and transport change in Tallahassee, Florida, using
both global and local measures of transport change and the relationship
between land use and transport patterns. At a regional level, many of these
studies have used higher levels of aggregation in census and travel survey
data.

This section demonstrates, through a series of GIS applications, that it is
possible to model regional transport patterns even at a disaggregate scale.
The applications show the capacity of GIS to assess detailed transport flows
and the use of transport networks, and to turn aggregate transport data into
a meaningful geo-visualization. In this section, journey to work (JTW)
datasets were used to conduct GIS analysis of transport dynamics at a regional
level. Two JTW datasets for the South East Queensland (SEQ) region were
obtained from the Australian Census for 1996 and 2006. Specifically, the
compiled JTW matrices for 1996 and 2006 for the SEQ region contained 300
origin zones and 300 destination zones (using suburbs) covering the geographic
region of Brisbane, the Gold Coast and the Sunshine Coast. There are three
types of information contained in the JTW data: GIS coverages for the origin
zones and for the destination zones of all trips, and the JTW
(origin-destination) matrix. Each record of the JTW matrix simply comprises an
origin code, a destination code and the total number of people travelling
between that origin and destination.


2.2. Modelling Travel Distance 


Firstly, the travel distance of JTW was modelled using disaggregated GIS
network analysis. One spatial issue of aggregated transport analysis is that
the geographical units of the transport zones (suburbs) are relatively large,
especially in regional areas, so the JTW distance measured between the
centroids of each pair of suburbs is not appropriate to represent the multiple
route choices of all commuters within a suburb. Therefore, we randomly
generated 10 points within each suburb, and each point was used as a single
departure or arrival location of travel. This method permits more advanced
measures of the movement of commuters on the road between multiple home
locations and workplaces. The point-to-point travel distances were then
summarized for each pair of suburbs to give an average suburb-to-suburb travel
distance. The suburb-to-suburb commuting distance was multiplied by the
number of commuting trips between each origin-destination pair (provided by
the JTW data), and the total travel distance of all commuting trips from every
single suburb (to all destinations) was calculated. The average commuting
distance for each suburb was then obtained by dividing this total by the total
number of commuters in the suburb. In order to identify variations in people’s
commuting behaviour for local areas, the commuting distance was calculated
based on suburb of residence (trip origins). The Queensland road network data
were used to calculate the network travel distance between origin and
destination.
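
The procedure described above can be sketched in a few lines. This is an
illustration under simplifying assumptions and not the workflow used in the
study: suburbs are stood in for by hypothetical bounding boxes, and
straight-line (Euclidean) distance replaces the road-network distance. All
names and values are invented for the example.

```python
import math
import random


def sample_points(bbox, n, rng):
    """Draw n random points inside a rectangular stand-in for a suburb."""
    xmin, ymin, xmax, ymax = bbox
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]


def mean_zone_distance(pts_a, pts_b):
    """Average point-to-point distance between two sets of sample points.

    Euclidean distance is an assumption; a network distance would be used
    when road data are available.
    """
    total = sum(math.dist(p, q) for p in pts_a for q in pts_b)
    return total / (len(pts_a) * len(pts_b))


def average_commute(trips, zone_dist):
    """Trip-weighted average commuting distance per origin zone.

    trips[(o, d)]     -- number of commuters from o to d
    zone_dist[(o, d)] -- average zone-to-zone distance in km
    """
    total_km, total_trips = {}, {}
    for (o, d), n in trips.items():
        total_km[o] = total_km.get(o, 0.0) + n * zone_dist[(o, d)]
        total_trips[o] = total_trips.get(o, 0) + n
    return {o: total_km[o] / total_trips[o] for o in total_km if total_trips[o] > 0}


# Hypothetical zones (km coordinates) and commuter counts.
rng = random.Random(42)
zones = {"A": (0, 0, 2, 2), "B": (10, 0, 12, 2)}
points = {z: sample_points(box, 10, rng) for z, box in zones.items()}
zone_dist = {(o, d): mean_zone_distance(points[o], points[d])
             for o in zones for d in zones}
trips = {("A", "A"): 60, ("A", "B"): 40, ("B", "B"): 100}
avg_km = average_commute(trips, zone_dist)
```

Because distances are averaged over several sampled locations per zone, the
within-zone distance is greater than zero, which a centroid-to-centroid
measure cannot provide.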

Next, the same GIS modelling procedure was applied to the 2006 JTW data in
order to compare the change in commuting pattern over time. However, a major
difficulty in the analysis of changes in transport is that the travel zones of
the JTW data change between census years. This raises a crucial analytical
issue of how these independent zonal structures can be spatially integrated
into a single, consistent set of geographical units. To overcome this problem,
an areal interpolation technique was applied to transform the spatial data
from the zones supplied by the Australian Bureau of Statistics (ABS) to a new
set of spatial units that are consistent between 1996 and 2006. We used 2006
suburbs as the new destination zones (consistent with the origin zones) and
applied an areal weighted proportioning method to transform the spatial data.
The data estimation for the new destination zones was based on the degree of
spatial overlap with known data for the previous destination zones. Although
the concept of the technique is simple, it proved challenging in practice
because the areal calculation for the large spatial matrix data was
computationally intensive.
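
A minimal sketch of areal weighted proportioning is given below. It assumes
that the overlap areas between old and new zones have already been computed
(for example by a polygon overlay in a GIS) and that every old zone is fully
covered by the new zones; the zone names and numbers are hypothetical.

```python
def areal_weighting(source_values, overlap_area):
    """Reallocate values from source zones to target zones by area share.

    source_values[src]       -- known count in an old (source) zone
    overlap_area[(src, tgt)] -- area of intersection between src and tgt
    Assumes each source zone is completely covered by the listed overlaps.
    """
    # Total area of each source zone, recovered from its overlaps.
    source_area = {}
    for (src, _tgt), area in overlap_area.items():
        source_area[src] = source_area.get(src, 0.0) + area

    # Each target zone receives the share of the source value that
    # corresponds to the share of the source area it overlaps.
    target = {}
    for (src, tgt), area in overlap_area.items():
        share = area / source_area[src]
        target[tgt] = target.get(tgt, 0.0) + source_values[src] * share
    return target


# Hypothetical example: old zones Z1, Z2 mapped onto new zones N1, N2.
old_counts = {"Z1": 100, "Z2": 50}
overlaps_km2 = {("Z1", "N1"): 3.0, ("Z1", "N2"): 1.0, ("Z2", "N2"): 2.0}
new_counts = areal_weighting(old_counts, overlaps_km2)
print(new_counts)
```

The method preserves the overall total, but it rests on the assumption that
commuters are spread evenly within each old zone.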

Finally, the results of the average commuting distance across suburbs for
1996 and 2006 are provided in Figure 1 (a) and (b). This shows that the
commuting distance tends to be shorter among workers who live closer to
central city areas (e.g. Brisbane metropolitan area, Gold Coast and
Toowoomba City), whilst longer commutes (mainly cross-suburban travel) tend to
be made by workers in the middle and outer suburbs. The regional trend is
therefore that the further one’s home is from a city centre, the longer one’s
commute tends to be.

The results also reveal local differences in average commuting distance
between 1996 and 2006. In general, there was only a minor change in average
JTW distance between 1996 (15.75 km) and 2006 (15.95 km). There was an
increased number of commuters travelling shorter distances to work (less than
10 km) by 2006, but the distance of travel for commuters in the middle range
(e.g. 10 to 30 km) slightly increased. The number of commuters with very long
commutes (30 km or more) remained stable between 1996 and 2006. A comparison
of Figure 1 (a) and (b) shows that over time a decrease in average commute
distance occurred at the Sunshine Coast (far north of Brisbane), Brisbane’s
west, and the Gold Coast suburbs. The possible reasons include fast urban
growth and employment relocation, which have introduced increasing numbers of
employment opportunities into these areas. People living in these areas tend
to find work locally, travelling relatively shorter distances. In addition,
increased commuting distances were observed in some outer-urban and regional
areas. Tentatively, the increase in long commuting could be driven by economic
restructuring and by new residents in emerging peri-urban locations who are
often reliant on employment well outside their local area, perhaps explaining
the increase in overall commute distances.

Figure 1. GIS mapping of travel distance for year 1996 and year 2006.


2.3. Modelling Traffic Flow


In this section, the GIS-based network modelling was used to model the 
distribution of traffic flows over the road networks. 

First, we calculated the shortest route for commuting travel between each
origin and destination, assigning all trips to these shortest routes. All
shortest paths carrying traffic flows were then overlaid on the suburb
boundaries, and total commuting flows were summed for each individual suburb
based on the total number of commuters passing through, originating in, and
ending in that suburb.
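
The assignment step can be sketched as follows. This is a simplified
illustration and not the software used in the study: each suburb is treated
as a single node of a hypothetical graph, Dijkstra's algorithm finds the
shortest path, and every trip is loaded onto that path (an all-or-nothing
assignment). The graph, lengths and trip counts are invented.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra on graph[node] = {neighbour: length}; returns a node list or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    # Walk back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def assign_flows(graph, trips):
    """Sum commuters originating in, passing through, or ending in each node."""
    load = {}
    for (o, d), n in trips.items():
        path = shortest_path(graph, o, d)
        if path is None:
            continue  # unreachable pairs are skipped in this sketch
        for node in path:
            load[node] = load.get(node, 0) + n
    return load


# Hypothetical three-suburb network (lengths in km) and commuter counts.
graph = {
    "A": {"B": 5.0, "C": 12.0},
    "B": {"A": 5.0, "C": 5.0},
    "C": {"A": 12.0, "B": 5.0},
}
trips = {("A", "C"): 100, ("A", "B"): 50, ("C", "C"): 20}
load = assign_flows(graph, trips)
print(load)
```

An all-or-nothing assignment ignores congestion, so the resulting loads are
best read as an indication of where flows concentrate rather than as
measured traffic volumes.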

Next, because the suburb is too coarse a spatial unit to represent the
spatial distribution of the commuting flows, we transformed the total number
of commuting flows from suburbs to 1 km by 1 km grid cells across the study
area. A binary dasymetric mapping method (Langford and Fisher, 1996) was
applied to spatially disaggregate the data. The binary dasymetric method
assumes that the number of travellers is uniformly distributed inside some
part of a suburb (in this case, the grid cells that intersect with the road
network) and that the remaining parts of the zone (non-road areas)
necessarily have a zero commuting flow value.
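
The binary dasymetric rule can be sketched as below. The sketch assumes that
each grid cell has already been assigned to exactly one suburb and flagged as
intersecting the road network or not; both inputs, and all identifiers, are
hypothetical.

```python
def binary_dasymetric(zone_flow, cell_zone, cell_has_road):
    """Spread each zone's flow evenly over its road cells; other cells get 0.

    zone_flow[zone]     -- total commuting flow of a suburb
    cell_zone[cell]     -- suburb that the grid cell belongs to (assumed unique)
    cell_has_road[cell] -- True if the cell intersects the road network
    """
    # Count the road cells in each zone.
    road_cells = {}
    for cell, zone in cell_zone.items():
        if cell_has_road[cell]:
            road_cells[zone] = road_cells.get(zone, 0) + 1

    # Road cells share the zone total equally; non-road cells receive zero.
    cell_flow = {}
    for cell, zone in cell_zone.items():
        if cell_has_road[cell]:
            cell_flow[cell] = zone_flow[zone] / road_cells[zone]
        else:
            cell_flow[cell] = 0.0
    return cell_flow


# Hypothetical suburb S1 covering four 1 km cells, three of which contain roads.
zone_flow = {"S1": 900}
cell_zone = {"c1": "S1", "c2": "S1", "c3": "S1", "c4": "S1"}
cell_has_road = {"c1": True, "c2": True, "c3": True, "c4": False}
cell_flow = binary_dasymetric(zone_flow, cell_zone, cell_has_road)
print(cell_flow)
```

A suburb with no road cells would lose its flow under this rule, so a real
workflow would need to check for that case.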

The maps of spatially disaggregated commuter flows for 1996 and 2006 are
shown in Figure 2 (a) and (b). At the regional level, the highest commuting
flows are concentrated in the central Brisbane area, extending north and south
along the transport networks. The 1996 data showed the commuting flow
stretching to the southern areas, and this tendency was found to be more
pronounced in 2006. The increased commuting flow across the
north-south corridor may have been caused by rapid urban expansion and new 
population settlements, especially in the southern suburbs of Brisbane, which 
generated increased commuting trips towards Brisbane. A similar pattern of 
commuting development was also found in the Gold Coast City area, where 
commutes spread considerably towards the north and south areas of the city 
along the coast. 

A comparison of travel flows between 1996 and 2006 demonstrated that the
central Brisbane area experienced the highest growth in commuting flows. This
was driven not only by an increase in inbound commuters but also by increased
cross-suburban traffic passing through the central Brisbane area. The most
significant growth in commuting was found along the transport corridor between
Brisbane and the Gold Coast. The major growth areas along this corridor show
some spatial clustering, indicating that in addition to increased commuting
interactions towards Brisbane, some increased internal commuting has also
developed in these areas. The change in commuting flows in Brisbane’s west was
not significant.

Figure 2. The distribution of travel flows for year 1996 and year 2006 (by
1 km by 1 km grid cells).


3. GIS FOR LOCAL TRANSPORT ANALYSIS 


In the last section, we demonstrated a range of GIS techniques applied to
model and explore the spatial dynamics of urban transport activities over a
large region. In this section, we focus on the GIS analysis of the spatial
dynamics of transport activities at a local scale using disaggregated Brisbane
CityCycle travel data. Unlike motorized transport (e.g. private cars),
non-motorized transport such as walking and cycling appears to be more
flexible in route and more sensitive to external conditions (such as weather
and temperature). Therefore, its transport patterns are considered more
complex and dynamic over space and time, and a dedicated spatio-temporal
technique is needed to capture these dynamics.


3.1. The Need for GIS


Technological improvements introduced in third generation public bicycle 
systems permit continuous monitoring of traffic flows of public bikes 
(Shaheen et al. 2010). As such, operators of public bicycle systems are able to 
gain access to real-time usage data of their networks. Collected via mobile 
devices or other ICT-based measures, this information can then improve our 
understanding of individual traveller's behaviour, offer real-time travel 
information, and also present personalised location-based services. Moreover, 
fine-grained data on the status of shared bicycles also enables an empirical 
measurement of the impacts of proposed system improvements or policy 
changes (e.g. fare restructuring) as well as the results of force majeure on the 
system (e.g. flooding). 

A number of studies have utilised stock or usage data to explore spatial 
and temporal patterns. For example, examining Barcelona's shared bicycling 
system ‘Bicing’, Froehlich, Neumann and Oliver (2009) investigated user 
behaviour across stations in relation to location, neighbourhood, and time of 
day. Still in the European context, Borgnat et al. (2009) predicted the number 
of bikes hired per hour in Lyon's community bicycle program to describe the 
daily and weekly patterns. The prediction method involved several explanatory 
factors such as the number of subscribed users, the time of the week, the 
occurrence of holidays or strikes, and weather parameters. However, methods 
to identify and visualise spatio-temporal patterns (i.e. location and times when 
frequency of bike use is particularly high) based on flow or trip data have not 
been adequately examined in past studies, particularly, within the Australian 
context. There is, therefore, an imperative need to better understand the 
location, time and reasons for these individual uses to inform strategies to 
ensure a more successful public bicycle implementation. 

Brisbane’s CityCycle dataset was used to conduct GIS analysis of transport
dynamics in urban areas. The CityCycle data contain trip-level information in
the form of an origin-destination matrix. There are a total of 150 CityCycle
stations distributed across the Brisbane CBD and its immediate surrounding
suburbs. The rows of the matrix represent origin stations and the columns
destination stations, and each cell holds the count of trips between them.
Analysis of this type of dataset in raw form is not generally viable given
that it consists of a large matrix of numbers with no geographic information
included. This is particularly the case in multivariate situations where the
difference in origin-destination matrices conditional on other variables (for
example, hour of day) is of interest. The argument for the role of exploratory
spatial data analysis has been made convincingly elsewhere (see for example,
Haining et al. 1998; Anselin 1999) and is strong in our case with the
CityCycle dataset, where no simple descriptive statistic can be easily defined
that captures the complex dynamics of an origin-destination matrix.


3.2. Using Flow Mapping 


The flow map is a well-established cartographic technique. It can assist in 
our understanding of these matrices by mapping transitions between spatial 
units. Here, lines depicting the transitions between spatial units are typically 
appended with arrows to indicate flow direction and the width of the line is 
used to indicate the volume of flow. While the flow map can be used to 
develop an understanding of origin-destination matrices, it does not readily 
allow the incorporation of other variables such as weather and calendar events 
as a component of the visual output. In such circumstances, bivariate spatial
data can be analysed using a technique termed the comap, in which overlapping
subsets of the data are selected using non-spatial variables and plotted in
separate panels. These plots of raw data can be used to give a sense of how
spatial relationships change conditional on one or more external variables
(for example, how bicycle trips vary spatially according to how windy it was
at the time the particular trips were made).

Flow mapping, a visual analytical tool to depict spatial interaction and
movement, has a long history dating back to 1869, when Charles Minard first
used the technique to depict the advance of Napoleon’s army towards Moscow
(Minard 1869). The development of computerised flow mapping tools, however,
only commenced in the late 1980s with Waldo Tobler’s flow mapper program. This
program was designed to visualise discrete node-to-node movements (Glennon &
Goodchild 2005). Since then, a number of
significant efforts, particularly in computer science, have been made to 
develop more advanced tools to visualise flow information in various 
standalone applications (see for example, Phan et al. 2005; Guo 2009; 
Boyandin et al. 2010; Boyandin et al. 2011). 

The flow map, on its own, only conveys the pertinent information of an
origin-destination matrix where the objective of the investigation is to study
the matrix in isolation. In circumstances where there is a need to investigate
the extent to which the origin-destination matrix changes as a function of
other variables (for example, under certain weather conditions), another
technique must be employed. In this section, the flow map technique was
embedded into another exploratory spatial data analysis technique called the
comap.

The comap (Brunsdon 2001) and its older, more generic cousin, the coplot 
(Cleveland 1994), provide an effective means of visualising multivariate data 
in order to assist with uncovering previously hidden patterns embedded within 
complex data. The basic idea is that a bivariate subset of raw data is selected 
based upon some condition (for example, a defined range of rainfall or a 
categorical variable such as weekend or weekday). This data is then plotted 
either in raw form in a scatter plot panel or a mapped kernel density surface. 
The comap and coplot embrace the small multiple principle (Tufte 1991) by 
using multiple panes of similar and typically overlapping regions, in order to 
illustrate how gradual changes can be observed as a function of external 
variables. 

Embedding the flow map into the comap forms a combined technique that is a
straightforward extension of the two and allows multivariate exploration of
origin-destination data, which otherwise would not be readily possible.
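
The conditioning step behind such a combined technique can be sketched as
follows. The sketch assumes equal-width overlapping intervals of a single
conditioning variable, which is a simplification (comap and coplot
implementations may choose their intervals differently), and it builds one
origin-destination table per panel that could then be drawn as a flow map.
The station labels and wind speeds are hypothetical.

```python
def overlapping_bins(lo, hi, n_panels, overlap=0.5):
    """Split [lo, hi] into n_panels equal-width intervals that overlap.

    overlap is the fraction of each interval shared with its neighbour.
    """
    width = (hi - lo) / (n_panels - (n_panels - 1) * overlap)
    step = width * (1 - overlap)
    return [(lo + i * step, lo + i * step + width) for i in range(n_panels)]


def od_by_panel(trips, bins):
    """Build one origin-destination count table per conditioning interval.

    trips -- iterable of (origin, destination, value of conditioning variable)
    A trip falling into two overlapping intervals is counted in both panels.
    """
    panels = [dict() for _ in bins]
    for origin, destination, value in trips:
        for k, (a, b) in enumerate(bins):
            if a <= value <= b:
                key = (origin, destination)
                panels[k][key] = panels[k].get(key, 0) + 1
    return panels


# Hypothetical trips with the wind speed (km/h) recorded at departure.
trips = [("S1", "S2", 3), ("S1", "S2", 8), ("S1", "S3", 12), ("S3", "S1", 18)]
bins = overlapping_bins(0, 20, n_panels=3, overlap=0.5)
panels = od_by_panel(trips, bins)
for interval, table in zip(bins, panels):
    print(interval, table)
```

Because neighbouring intervals overlap, gradual changes in the flow pattern
show up across adjacent panels instead of as abrupt breaks.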


3.3. CityCycle Flow Dynamics 


We examine the spatial flows of CityCycle over a 24-hour period and the
effect of specific calendar events (i.e. weekdays, weekends, public and school
holidays) on the trip patterns. The flow maps of CityCycle trips between
Brisbane’s inner suburbs are displayed in Figure 3. The number of trips
between suburbs is represented using a variable line thickness, where the
width of the line is proportional to the total flow volume. Origin-destination
pairs generating fewer than 200 trips in total were not mapped in order to
preserve graphical clarity and highlight the main suburb-to-suburb
interactions. The choropleth classification of suburbs represents the
within-suburb flows (i.e. where the bicycle release and return are in the same
suburb), with darker colours representing higher numbers of internal trips.
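
The mapping rules described above can be sketched as a small preprocessing
step. The 200-trip threshold and the proportional line width come from the
text; the function name, the maximum line width and the example counts are
hypothetical, and the threshold is applied here to each directed pair, which
is an assumption about how "in total" was counted.

```python
def flow_lines(od_counts, min_trips=200, max_width=8.0):
    """Prepare flow-map inputs from origin-destination counts.

    Returns (widths, within):
      widths[(o, d)] -- line width, proportional to volume, for pairs between
                        different suburbs with at least min_trips trips
      within[o]      -- within-suburb trips, to be shown as a choropleth
    """
    between = {pair: n for pair, n in od_counts.items()
               if pair[0] != pair[1] and n >= min_trips}
    within = {pair[0]: n for pair, n in od_counts.items() if pair[0] == pair[1]}
    top = max(between.values(), default=0)
    # The largest flow receives max_width; all others scale linearly.
    widths = {pair: max_width * n / top for pair, n in between.items()} if top else {}
    return widths, within


# Hypothetical trip counts between three suburbs.
od_counts = {("A", "B"): 800, ("B", "A"): 400, ("A", "C"): 150, ("A", "A"): 300}
widths, within = flow_lines(od_counts)
print(widths)
print(within)
```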

As Figure 3 shows, a high proportion of trips take place over relatively
short distances, along with a high degree of interaction between adjacent
suburbs. The number of trips appears to be higher (but not concentrated)
between the Brisbane Central Business District (CBD) and the immediate
surrounding suburbs, whilst the number of trips between suburban locations is
lower. With regard to self-containment, a number of suburbs exhibit relatively
high levels of within-suburb trips. These results are expected given that each
of these suburbs has high concentrations of urban amenities, including parks,
river-side bike paths and public transport links. Results also highlight that
early morning trips (between 5am and 10am) are more spatially dispersed. Later
in the morning and in the early afternoon (between 10am and 2pm), trips tend
to be more concentrated and spatially focussed around the CBD and the
immediate surrounding suburbs. This spatial flow pattern continues until
around 5pm, after which (from 5pm to 10pm) trips begin to spread further into
the suburbs but appear less dispersed than during the morning peak hours.
There is also evidence of a relatively high proportion of self-contained trips
at some suburbs that remains relatively constant across the 24-hour period.

Figure 3. Flow map for public bicycle trips (by hour of day and for specific
calendar events).

The effects of specific calendar events (i.e. weekdays, weekends, public
and school holidays) and hour of day on trip patterns were examined
simultaneously using an array of flow maps. A number of specific observations
can be made. First, CBD-based trips taking place during weekends, especially
between the hours of 9am and 5pm, are shown to be markedly less concentrated
than those occurring during workdays. Trips taking place during the evening
(i.e. after 5pm) show a significant reduction on weekdays that is not as
marked during weekends. The effect of public holidays on the spatio-temporal
patterns is very similar to that of weekends. Trips occurring during school
holidays differ very little from those on regular weekdays apart from a small
reduction in the number of trips taking place between the suburbs and the CBD
during peak hours.
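An array of flow maps of this kind amounts to grouping trip records by calendar category and time band before each group is mapped. The sketch below is illustrative only: the record layout, the function names, the holiday date sets and the rule that a public holiday takes precedence over a weekend are all assumptions, and the time bands simply follow the periods discussed in the text.

```python
from collections import defaultdict
from datetime import datetime

# (start hour, end hour, label); bands follow the periods discussed above
TIME_BANDS = [(5, 10, "05-10"), (10, 14, "10-14"), (14, 17, "14-17"), (17, 22, "17-22")]


def day_type(ts, public_holidays=frozenset(), school_holidays=frozenset()):
    """Classify a timestamp; the precedence order here is an assumption."""
    day = ts.date()
    if day in public_holidays:
        return "public holiday"
    if ts.weekday() >= 5:  # Saturday = 5, Sunday = 6
        return "weekend"
    if day in school_holidays:
        return "school holiday"
    return "weekday"


def time_band(ts):
    """Return the label of the band containing the trip start hour."""
    for start, end, label in TIME_BANDS:
        if start <= ts.hour < end:
            return label
    return "other"


def panel_flows(trips, public_holidays=frozenset(), school_holidays=frozenset()):
    """Count origin-destination flows for every (day type, time band) panel.

    trips: iterable of (start_time, origin, destination), where start_time
           is a datetime (hypothetical record layout).
    """
    panels = defaultdict(lambda: defaultdict(int))
    for start_time, origin, destination in trips:
        key = (day_type(start_time, public_holidays, school_holidays),
               time_band(start_time))
        panels[key][(origin, destination)] += 1
    return {key: dict(flows) for key, flows in panels.items()}
```

Each key of the returned dictionary corresponds to one small map in the array, so weekday and weekend panels for the same time band can be compared side by side.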


4. DISCUSSIONS 


Understanding the spatial patterns of transport activities continues to grow
in importance. As GIS techniques become more established in this area, they
will enhance analyses of transport data that seek to derive a deeper
understanding of transport patterns and their underlying spatial structure and
dynamics. The development and application of GIS-based techniques will
supplement the planners’ toolbox. When the techniques presented in this paper
are developed into deployable solutions, their added value to urban and
transport analysis can be fully evaluated. At that point, it will be possible
to respond better to transport questions and to inform the future development
of timely and geo-targeted policy that could potentially enhance planning
efficiency.

The first part of this chapter has investigated the JTW dynamics in a large
region based on spatial analysis of JTW data with a focus on geographic
patterns of travel distance and travel flows. The use of JTW datasets is far
from straightforward because of the complexity of the data and changes in
the geography of traffic zones over time. Therefore, we utilised advanced GIS
techniques to analyse JTW data spatially and to present them in forms that
better serve transport analysis. Firstly, we employed an area
interpolation technique to transform two data sets (the 1996 and 2006 JTW
matrices) into consistent geographical units that are independent of the two
original and inconsistent zonal structures. This method provides new
opportunities to examine spatial and temporal changes in urban transport
patterns. Secondly, we applied a GIS-based network analysis to compute the
average travel distance between the origin-destination traffic zones. The
method accounts for all possible routes between randomly generated points in
order to give more advanced measures of the effects of multiple residential
locations, workplaces and road choices on the resulting travel distance. GIS
network modelling was also used to model the distribution of traffic flows
through road networks. The procedure calculated the number of trips travelling
on every road (network link) to estimate a traffic map. The map is used to
analyse the distribution of traffic flow over time and congestion. The results
are spatially disaggregated to 1 km by 1 km grid cells, making them suitable to
inform transport analysis and policy making at the local level (e.g. traffic
noise, carbon emissions and energy consumption at street block level). Both
of these GIS techniques were found to be very useful tools for modelling the
spatial dynamics of commuting from the complex JTW datasets.
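The idea behind transferring an origin-destination matrix onto a common set of geographical units can be illustrated with simple area weighting. This is a minimal sketch and not necessarily the interpolation variant used in the study: it assumes trips are spread evenly within each source zone, and the function names and the overlap table (intersection areas between source zones and target units, as a GIS overlay would produce) are hypothetical.

```python
def zone_shares(overlap_areas):
    """Fraction of each source zone's area falling in each target unit.

    overlap_areas: {(source_zone, target_unit): intersection area}
    """
    totals = {}
    for (source, _), area in overlap_areas.items():
        totals[source] = totals.get(source, 0.0) + area
    shares = {}
    for (source, target), area in overlap_areas.items():
        shares.setdefault(source, {})[target] = area / totals[source]
    return shares


def interpolate_od(od_matrix, overlap_areas):
    """Re-express an OD matrix on the target units by area weighting.

    od_matrix: {(origin_zone, destination_zone): trips}
    Each flow is split by the area share of its origin zone and of its
    destination zone, so the total number of trips is preserved.
    """
    shares = zone_shares(overlap_areas)
    result = {}
    for (origin, destination), trips in od_matrix.items():
        for target_o, weight_o in shares[origin].items():
            for target_d, weight_d in shares[destination].items():
                key = (target_o, target_d)
                result[key] = result.get(key, 0.0) + trips * weight_o * weight_d
    return result
```

Applying the same function to the 1996 and 2006 matrices, each with its own overlap table, would yield two matrices defined on identical units that can be compared cell by cell.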

The second part of the chapter has shown how, at a disaggregated scale, a
flow mapping technique can be used to explore the spatio-temporal dynamics of
public bicycle use. In most previous research into public bicycle systems,
stock data have been the predominant source used to study their underlying
dynamics. In this study, we have highlighted the utility of flow or
trip-level data, which offer new opportunities for research to examine the
spatio-temporal dynamics of public bicycle use. Developing an understanding of
the complex spatio-temporal dynamics of public bicycle use at a local scale is
critical to compiling an evidence base with the capacity to ensure the system
is configured in a manner that meets the needs of public bicycle users. The
data necessary to establish such an evidence base often exist in the form of
disaggregate trip-level records. The challenge now for transport planners and
researchers is to draw upon existing data and, where necessary, develop new
tools and techniques to examine these data in a manner that has the potential to
improve the operation of public bicycle systems. This paper has attempted to
progress research in this area through the development of a visual analytic, the
flow map, to explore new insights into the association of bicycle trip patterns
with hour of day and calendar events. Gaining a better understanding of these
underlying dynamics is a first step towards establishing a fully automated
monitoring tool with the capacity to identify expected movements of bicycles
around the system, given certain temporal and environmental circumstances. The
analysis presented here has extended our knowledge of public bicycle dynamics
by adopting a GIS approach to examine disaggregate-level relationships between
bicycle trips and hour of day and calendar events. Whilst the findings reported
in this paper are important for transport planners, such research could in
turn lead to operational benefits for public bicycle systems, including the way
in which bicycles are distributed across the system on weekends and weekdays
to ensure that supply meets demand as closely as possible.


CONCLUDING REMARKS


Understanding urban transport patterns using spatial analysis will increase
in importance. Two sets of GIS-based techniques were presented in this
chapter; each demonstrates how useful established GIS applications are in
practice for better understanding questions on transport activities, from
large regions to smaller metropolitan areas. As GIS techniques and
applications become more established, they will enhance transport planning
that seeks to derive a deeper understanding of the spatial structure and
dynamics of transport systems under various conditions.

The development, application and validation of the spatial disaggregation
techniques will supplement the planners’ toolbox. When the GIS techniques
presented in this chapter are developed into deployable solutions, their
added value to transport planning and research can be fully evaluated. At
that point, it will be possible to respond better to transport questions and
to inform the future development of timely and geo-targeted policy that could
potentially improve the deployment of public investment, enhancing efficiency
and reducing costs.
Limited evidence has led to considerable debate about land routes
and methods to move megaliths chosen for sculpture by prehistoric 
societies. The current research is investigating the question using 
Geographic Information System (GIS) to determine likely land pathways 
to transport megaliths over 100 kilometres in Mesoamerica by Preclassic 
Olmec society. Access was restricted and the terrain included floodplains, 
seasonal rivers and extensive swamps. Analyses were derived from 
digitised survey maps using slope gradient tools initially from 
ARCVIEW 3.2 and finally ARC10. Although compatibility issues arose 
with this combination as we describe, tools of both versions provided a 
starting point between stones’ source from which to then define a 
pathway across the challenging terrain. 
INTRODUCTION


The study by archaeologists of large stone or megalith transportation by 
prehistoric societies is often restricted by limited physical evidence. The 
known evidence will frequently show the stones were retrieved over long 
distances, across difficult and variable terrain. These considerations influence 
the transportation methods that could have been employed, the routes used and 
the time needed to complete these tasks. Most transportation efforts are 
affected by seasonal restrictions and manpower availability. 

These considerations impose the need to manage, collate and analyse 
extensive spatial data sets to establish starting points for the archaeological 
investigation. Further analysis may then be possible using other methodologies 
and information sources. These analyses would include replication 
experiments, ethnographic observations and environmental data linked by 
mathematical models. All megalith transportation efforts are constrained by 
slope gradient limitations, which define viable movement of stones. Analysis
of environmental factors using Geographic Information Systems (GIS),
incorporating human physiological capability and slope gradient analysis as
constraints, allows viable transportation routes to be identified. This paper
describes how we established a database and applied the GIS analysis, while
identifying its limitations and the compatibility concerns between the early
analyses and those that followed.

The study of megalith transport in ancient societies provides an insight 
into various elements of these societies such as the necessary economies to 
support this activity and the relationship between the sculpture and political or 
hierarchical status of individuals within the society. 

Replication experiments are often limited to specific examples (Cyphers, 
2006, Richards and Whitby, 1997) and so require a methodology that can be 
used to synthesize relevant factors. Establishing how and where the
transportation was done is often the subject of considerable debate; hence the
use of slope gradient analysis to define both method and routes is important.

In theory, massive weights can be moved by manpower alone; however, in
reality these loads are limited by the hauling teams’ ability to co-ordinate
their power. Richards’ experiment (Richards and Whitby, 1997) suggested that,
in theory, 200 persons were needed for a 40-tonne stone to be moved uphill on a
gradient of 1 in 20. In practice 130 people were used, while only 60 were
needed for downhill hauling or control (Richards and Whitby, 1997).
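The way gradient changes the size of a hauling team can be illustrated with the standard inclined-plane relation, in which the force needed to drag a mass m along a slope of angle θ with sliding friction coefficient μ is F = m g (μ cos θ ± sin θ), with the plus sign for uphill hauling. The sketch below is not taken from Richards and Whitby; the function names are hypothetical, and the friction coefficient and the sustained pull per person are free parameters that would have to come from replication experiments.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def hauling_force(mass_kg, gradient, friction, uphill=True):
    """Force in newtons to drag a load along a uniform slope.

    gradient: rise over run (1 in 20 is 0.05)
    friction: sliding friction coefficient (an assumed input)
    A negative result for downhill hauling means the load would slide
    unaided and would need to be restrained rather than pulled.
    """
    theta = math.atan(gradient)
    along_slope = math.sin(theta) if uphill else -math.sin(theta)
    return mass_kg * G * (friction * math.cos(theta) + along_slope)


def team_size(mass_kg, gradient, friction, pull_per_person_n, uphill=True):
    """Smallest whole number of haulers whose combined pull meets the force.

    pull_per_person_n: sustained pull per person in newtons (an assumed
    input, not a value reported in the source).
    """
    force = hauling_force(mass_kg, gradient, friction, uphill)
    return max(0, math.ceil(force / pull_per_person_n))
```

For any fixed pair of parameter values, the uphill team on a 1 in 20 gradient comes out larger than the downhill team, which is the qualitative pattern reported in the experiment. The absolute numbers depend entirely on the values chosen for `friction` and `pull_per_person_n`.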

The experiment described by Richards and Whitby (1997) concluded that
progress of approximately 1 km per day on level ground can be expected.
These data, together with ethnographic sources (ArchaeoNews, 2004, Harmon,
2005, Mladjov and Mladjov, 1999, Van Tilburg, 1995) and other sources 
(Royal Engineers, 1960, Royal Engineers, 1952) were used in conjunction 
with GIS slope analysis during this study. The following section describes the 
GIS research methodology associated with the megalith transport research. 
Analyses are detailed and conclusions outline its application to Olmec 
megalith transport and issues arising from this approach using GIS. 


OLMEC COLOSSAL HEADS AND THEIR 
MEGALITH TRANSPORT 


The Olmec are often referred to as the “Mother Culture”, a title that is
debated. This society held a sphere of influence some 200 kilometres long and
80 kilometres wide (125 x 50 miles) known as the “heartland”. At its centre
during a period between 1200 and 900 BC, known as the Preclassic, was the San
Lorenzo (SL) Plateau, the political hub of the period. Situated some 60
kilometres (38 miles) from the Gulf of Mexico, the SL Plateau is a partly
man-made ridge around 1200 metres long and rising some 45 metres above the
extensive floodplains and swamplands characteristic of the Rio Coatzacoalcos
Basin. Its position and elevation make the plateau a dominant feature over this 
area, underwritten by the agriculturally productive floodplains that supported 
the general population and its hierarchy comprising artisans and rulers. 


Photo Leslie C. Hazell. 


San Lorenzo Head 1; Xalapa Museum of Anthropology. 
Of the eighteen known Colossal Heads attributed to the Olmec, ten heads
were found on the San Lorenzo Plateau and these weigh between six tonnes 
and 25 tonnes. They vary in height between one and a half metres to nearly 
three metres. Their circumference varies from just over three metres up to 
nearly six metres (Clewlow, et al., 1967). Their source is accepted as being 
from or near Cerro Cintepec in the foothills of the Tuxtla Mountains (Williams 
and Heizer, 1965). The straight line distance is around 80 kilometres or 50 
miles with swamps, flood plains and many rivers that must be crossed. 

At least one head is believed to have been a reused altar stone (Porter,
1989). The heads are noteworthy for their broad noses, short-faced styles and
flat-backed form. The latter observation may be a clue to the method of
transport, as striations are visible and such marks could be caused by direct
contact with the ground; conversely, these may have been created by sculptors
(Clewlow, et al., 1967).

A study using and testing various parameters essential for viable transport 
concluded that water routes would not be viable due to environmental, 
watercraft and crew capability limitations (Hazell, 2013, Hazell, 2011). These 
uncertainties indicate the difficulties of conflicting evidence and the need for 
robust data analysis of transport routes and methods. Many other smaller 
stones were moved and used in other sculptures, but the size and mass of the 
Heads makes their retrieval over such a distance and challenging terrain 
logistically complex. The GIS analyses, as shown in the various figures, 
indicated viable corridors that would avoid most river crossings, swamps and 
adverse gradients. Further analysis suggested that land transport was viable by 
using a direct contact dragging process as soil bearing capacity of dominant 
soils was adequate even in the vicinity of floodplains (Hazell, 2011). The 
known technology of the Olmec indicated their willingness and capacity to 
construct causeways to overcome problems associated with the floodplains 
(Cyphers, 1997). 


MEGALITHIC TRANSPORT BY LAND 


While slope analyses formed an important part of the investigation, 
identifying viable land routes had to include the avoidance of wide or fast 
flowing rivers, flood plains and swamps. Gradient is a major constraint when 
hauling megaliths uphill or while maintaining control during descents, as 
ethnographic records and replication experiments clearly describe (Dillon, 
2004, Heyerdahl, 1958, Richards and Whitby, 1997). Arguably, significant 
labour sources influence transport pathways, so the location and size of
villages may have also contributed to route choice. Some routes appear 
possible but seasonal variations change their viability. 

Floodplains, swamps and river positions could all change over millennia.
Using GIS allowed us to establish a viable database of terrain features that
included signs of a feature's past position, as in the example of oxbow lakes
or lagoons. It is therefore possible to interpret past landscape features and
take these into account during the analysis process.

This investigation was a theoretical exercise using GIS technology as an 
analytical and management research tool. It was limited as later comments will 
highlight. 

In modern times, hauling large stones over long distances would be a 
mechanised process. Therefore our expectations of what was possible, in terms 
of capability, time frames and commitment, could be vastly different from 
those of prehistoric societies. With a limited archaeological record, our
database and GIS software allow interpretative analyses and comparative
testing of potential scenarios to provide testable outcomes on the question of
land transport routes.


ESTABLISHING THE DATABASE 


1:250000 and 1:50000 survey maps were sourced to establish the database 
comprising themes of contours, rivers, swamps, soil types and vegetation 
(Figure 1). This data was combined with historical observations and 
contemporary archaeological surveys of land features where possible. 
Nevertheless these sources would not portray or indicate prehistoric terrain 
conditions, while hydrology dynamics would change the position of swamps 
and oxbow lakes on the floodplains. Such positional changes could not be 
expected to materially affect analytical outcomes, as pathways would shift 
marginally within the same area to suit these changes. 

The management advantages of a GIS database are well known (Longley et
al. 2001) but its application to the Olmec research should be explained. The
regional nature of the stone transport in Mesoamerica posed particular
challenges for which GIS tools were well suited. Nevertheless, considerable
scanning of hard copy maps and data processing was needed to form a
comprehensive geo-database. In the early stages of this process it was evident
that the scan resolution would determine the interpretation quality required
to generate a usable terrain model. This necessity became a compromise between
the practicalities of managing file sizes and the need for reasonably accurate
data. Larger files allowed clearer image processing; however, file handling was
limited by available hardware (Figure 2).

Our study began by developing a geographical surface model of the region 
using ArcView 3.2 software and its accompanying 3-D Analyst tool (Figure 
3). This model allowed us to geo-spatially analyse slopes in an extensive 
landscape and identify potential transport corridors without the need for 
extensive individual pathway calculations. The surface model was developed 
using current topographical maps of the area. Without ground-truthing, our 
interpretation and digitising is subject to the tolerances and accuracy of the 
source survey maps. Error when using a Digital Elevation Model (DEM) and
the associated slope tools in the GIS was expected during this assemblage
process (Hageman and Bennett, 2000). The digitising process itself was
understood to be another potential source of errors, which arise from
interpretation. This problem has been noted by others, as have strategies to
minimize it (Bolstad, et al., 1990, Morad, et al., 1996).




Figure 1. Preliminary digitising using INEGI survey map 1:50000 (L. C. Hazell). 
Figure 3. 3D terrain scene using ARC10 Scene (L.C. Hazell).


In the background the Tuxtla Mountains can be seen while the San Lorenzo Plateau is
in the centre foreground.
The initial digitising procedure was undertaken using a HP 5100c flatbed
scanner with a default scan speed for high quality scans. Resolution was set at
150 dots per inch (DPI). Even with the larger scale of the survey maps, each
map required at least four and sometimes six A4 scanned sheets that needed to
be geo-located and joined in the GIS software. This also became a potential
source of cumulative errors.

In an attempt to overcome these potential errors, digital photography was 
used on the original survey maps. The camera used in this process was a 
Digital SLR Nikon D1X with a 28-70mm 2.8 lens. The picture quality setting 
for this process was the RAW format with a resolution of 300dpi. The 
photographs were taken outside in a shaded area with natural light and a 
distance of 1 to 1.5 m between the map and the camera. Generally, shutter
speed was 1/100th of a second, but the aperture was varied between f/18 and
f/11 with a focal length between 42 and 31 mm, which, when translated into
35 mm film equivalents, was between 63 mm and 46 mm. The advantage of speed and
efficiency that was anticipated by this procedure was negated by a lack of map 
clarity and consequent limitations on interpretation of the final photographic 
images. This was in spite of enhancement using Adobe Photoshop 7 filters and 
image adjustments to the image sharpness, brightness and contrast. 

With these disappointing results we returned to the initial process of
joining A4 scanned images (see also Hazell and Brodie, 2012).

To keep within a practical research time frame we adopted a multi-layered 
approach by capturing thematic layers from scanned 1:250,000 survey maps. 
In later analysis, maps with a scale of 1:50,000 were used to provide greater 
detail in specific areas of interest. All maps were scanned and enhanced using 
Adobe Photoshop 7 to improve legibility and associated accuracy during final 
digitisation. Nevertheless our interpretation of specific features such as 
contour position and value imposed limitations on final accuracy. Points on 
contour lines were inserted at changes in direction and intermediate points 
were included as frequently as possible to minimize error (Douglas and 
Peucker, 1973). The individual maps were joined into a matrix, as shown in 
Figure 3, from which an initial analysis could be undertaken. The contour data 
was then used to create a surface model of the landscape as a Triangular 
Irregular Network (TIN) that was formed from our digitised contour data 
(Figure 4). 
Figure 4. 3D scene of same area using Arcview 3.2 (L.C. Hazell).


In the background errors noted in the text due to contour crossovers in digitising can 
be seen. These did not affect analyses. 


The slope gradient and 3D model tools produced analyses and
representations with visible errors formed through contour crossing or
incorrect digitising. This occurred with the earlier version, Arcview 3.2;
however, when digitising contours using ARCMap 10, it was necessary to
zoom in or deactivate the Snapping tool to avoid crossing contours or
incorrectly joining contours of different elevations. The same technique
applies when editing vertices. This problem occurred only when contours were
close together.
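Crossovers of this kind can also be screened for automatically once the contours are stored as vertex lists. The following is a minimal sketch using a standard orientation test for segment intersection. It is not a feature of the software named above; the data layout (an elevation paired with a list of x, y vertices) and the function names are assumptions.

```python
def _orientation(a, b, c):
    """Signed area test: positive if a, b, c turn anticlockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1, p2, q1, q2):
    """True when segment p1-p2 and segment q1-q2 cross at an interior point."""
    d1 = _orientation(q1, q2, p1)
    d2 = _orientation(q1, q2, p2)
    d3 = _orientation(p1, p2, q1)
    d4 = _orientation(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def find_contour_crossings(contours):
    """Return index pairs of contours with different elevations that cross.

    contours: list of (elevation, [(x, y), ...]) tuples (hypothetical layout).
    Every segment pair is compared, which is adequate for a map sheet but
    slow for very large datasets.
    """
    crossings = []
    for i in range(len(contours)):
        elevation_i, line_i = contours[i]
        for j in range(i + 1, len(contours)):
            elevation_j, line_j = contours[j]
            if elevation_i == elevation_j:
                continue
            if any(
                segments_cross(a, b, c, d)
                for a, b in zip(line_i, line_i[1:])
                for c, d in zip(line_j, line_j[1:])
            ):
                crossings.append((i, j))
    return crossings
```

Flagged pairs could then be inspected and re-digitised at a higher zoom level before the surface model is built.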

In spite of this problem, only a small percentage of the final map area was 
affected and usually these contour crossings only occurred when contours 
were very close together indicating a steep gradient. This would automatically 
exclude these parts of the resulting terrain surface from consideration as a 
transport route because of the excessive gradients involved. Attempting to 
control a twenty tonne stone on steep downward slopes would have been 
impractical as replication and historical observation illustrated (Dillon, 2004, 
Heyerdahl, 1958, Richards and Whitby, 1997). In any case the analyses 
identified safer, more practical options that were nearby (Figure 5). 
Figure 5. Detail of Slope Gradient analysis using Arcview 3.2 (L.C. Hazell).


The darker pixels indicated the steeper gradients and generally these are gradients 
which modern roads or tracks follow. 


The analysis could have been undertaken empirically by simply reading
the contours on the map or tracing likely routes that showed minimal grades and
then marking these on photocopied survey maps. But this would have
necessitated either assessing contour spacing visually or measuring and
calculating the grade along a multitude of potential pathways. The study area
exhibits complex landforms, reflecting their volcanic origins. Using the slope
gradient tool in ArcView ensured consistent analyses. This is particularly
valuable when working with large areas and small-scale maps. Slope is
defined by pixel size. Adobe Photoshop pixel counts and grid value tools
determined the ground area that each pixel represented on our maps. This gave
a linear distance per pixel of ±40 m on 1:250000 scale maps, ±16 m at 1:50000
and ±4 m at 1:16000.
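The dependence of slope on pixel size can be seen in a small gridded calculation. The sketch below uses plain central differences on an elevation grid; it is a simplified stand-in and not the algorithm implemented in the ArcView slope tool, which works on a 3 x 3 neighbourhood. The function names are hypothetical, and the 5 per cent default threshold (a gradient of 1 in 20) is only an illustrative assumption.

```python
import math


def slope_percent(dem, cell_size):
    """Slope in percent rise for each cell of an elevation grid.

    dem:       list of rows of elevations in metres
    cell_size: ground distance represented by one pixel, in metres
    Central differences are used inside the grid and one-sided
    differences along the edges.
    """
    rows, cols = len(dem), len(dem[0])
    slopes = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            dz_dx = (dem[r][c1] - dem[r][c0]) / ((c1 - c0) * cell_size) if c1 > c0 else 0.0
            dz_dy = (dem[r1][c] - dem[r0][c]) / ((r1 - r0) * cell_size) if r1 > r0 else 0.0
            slopes[r][c] = 100.0 * math.hypot(dz_dx, dz_dy)
    return slopes


def viable_mask(slopes, max_percent=5.0):
    """Mark cells whose slope does not exceed the chosen limit."""
    return [[value <= max_percent for value in row] for row in slopes]
```

The same elevation differences give a gentler computed slope when `cell_size` is larger, which is why the ground distance per pixel has to be established before gradients from maps of different scales are compared.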

To manually sample slope grades at these scales would require analysis of 
each change of contour value across a potential pathway. The total fall from 
the source to the floodplain is some 400m in vertical height with contour 
intervals between 10 m and 20 m. Analysis of this area and gradient range
would require between 50 and 260 individual measurements for each potential 
path even if this was confined to a width of only 500 metres. Therefore, the 
slope gradient tool is more efficient in research hours and provides acceptable 
accuracy for route options that can be used with other data such as soil types, 
vegetation and hydrology to corroborate likely pathways (Figure 6). 
Figure 6. Route corridor options using Arcview 3.2 (L.C. Hazell).
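One way to combine a slope surface with other themes such as swamps or rivers is a least-cost path search over a cost grid. The sketch below is a swapped-in illustration using Dijkstra's algorithm; it is not the procedure used to produce the corridors in Figure 6, which were interpreted from the slope gradient output. The cost grid, the use of None for impassable cells and the function name are all assumptions.

```python
import heapq


def least_cost_path(cost, start, goal):
    """Cheapest 4-connected path across a cost grid.

    cost:  list of rows; each cell holds the cost of entering it (for
           example a slope-derived penalty) or None where the terrain is
           treated as impassable (for example swamp or a wide river)
    start, goal: (row, col) tuples
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        distance, cell = heapq.heappop(queue)
        if cell == goal:
            path = [cell]
            while cell in previous:
                cell = previous[cell]
                path.append(cell)
            return path[::-1]
        if distance > best.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] is not None:
                candidate = distance + cost[nr][nc]
                if candidate < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = candidate
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (candidate, (nr, nc)))
    return None
```

Rerunning the search with different penalties for soil type, vegetation or seasonal flooding would give alternative corridors that can be compared against the archaeological and ethnographic evidence.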


LIMITATIONS ASSOCIATED WITH GIS APPLICATION 
TO THIS RESEARCH 


A major limitation in this study was the process of data acquisition. 
Digitising such extensive landscapes is always a challenging undertaking. 
Without access to large scale remote sensing technologies, this study relied on 
the digitisation and interpretation of topographical land survey maps. Errors in
interpretation of contour lines do occur and may have influenced the gradient
analysis; however, the points of shallowest gradient in the landscape
correspond to those parts of the digitised map which are easiest to interpret
accurately. Additionally, the route corridors identified by this method follow
pathways similar to modern roads in the general area. Therefore the choice of
potential transport paths based on gradient analyses has validity. The
benefits of GIS slope gradient and thematic database analyses are emphasised
when linked to other research methodologies including soil bearing capacity,
friction coefficient studies, and other technology such as High Resolution
Satellite image analysis. However, we also noted some limitations while using
GIS with this research:


• Care must be taken when using data from ARCView3.2 files and its
coordinate systems within ArcScene 10. The need to ground truth
data is highlighted by this problem.

• ArcView 3.2 Slope gradient tool outcomes were easier to interpret
than those derived from ArcView 10.


CONCLUSION 


The pathways identified in these analyses are not final solutions to the 
megalith transport problem; however a research project of this type and scale 
required a manageable protocol from which further research, including field 
surveys, could be established. The purpose of this paper was to highlight the 
value and options of using GIS software for doing this. We have noted some 
limitations in its use but we have also demonstrated how this technology 
contributed to this research. It is proposed that further studies of Olmec
land transport be undertaken on the basis of this work. Research is also
proposed that will include megalith transport in Neolithic Britain, to
retrieve the Bluestones used in Stonehenge and standing stones on the Orkney
Islands. Utilising other GIS attributes, research will be extended to human
energetic aspects of megalith use. This focus will include constructing
various funerary enclosures at Nan Madol on Pohnpei and Palauan terraces in
the Oceania region.
ABSTRACT


This chapter presents a tool for studying natural disasters, such as 
seismic and impact events, using real data from Catalogs of earthquakes 
(EC) and Earth’s impact structures (EISC) [1]. It is ENDDB [2] (the 
Earth’s Natural Disasters DataBase), a new version of a geoinformation
system (GIS). The algorithms implemented in ENDDB allow visualizing 
a selected part of a current catalog in a pseudo-3D background map. With 
its mathematical support, the ENDDB system can plot frequency 
dependences of magnitudes or sizes (crater diameters) of events from 
various samples, as well as other distributions of integrated parameters in 
time and space, or their relationships with one another. 

The expert earthquake database (EEDB). There existed an earlier
GIS version, the EEDB system [3], which had a wide range of
seismological applications. It evolved gradually from a conventional
GIS (originally created by the authors) into a high-tech expert system,
updated by successively including various mathematical methods for
earthquake data processing, new parameters of seismic regime, and
advanced representation tools. The implemented algorithms [4] allow the user
to compute and visualize maps and diagrams of seismicity parameters
(slope of magnitude-recurrence curves, seismic quiescence, earthquake
density, etc.), to reveal clustering of events, and to remove aftershocks.
Modifications and versions of GIS-EEDB for different geodynamic 
regions [5] are illustrated in the chapter with case studies of seismic 
anomalies. 

Visualization and analysis of EISC data. Applying the EEDB 
system software to EISC data [1] (in the new GIS system, called Earth’s 
Impact Structures Catalog (EISC) [6]) allows gaining insights into spatial 
patterns of impact structures. In addition, the shapes of craters are 
constrained using a shaded relief model based on NASA data arrays of 
SRTM (Shuttle Radar Topography Mission) and ASTER GDEM (Global 
Digital Elevation Model), and the technology of digital mapping. Thus
typical elements of impact crater morphology have been systematized
and can be used as indicators of the crater origin [7].

Gravity data and new applications of GIS ENDDB. The reliability 
of geomorphically expressed diagnostic indicators of crater shapes was 
checked against geophysical features revealed by gravity data, namely, 
the presence of tail-shaped negative gravity anomalies produced by large 
impact craters. By mapping gravity anomalies, using our shaded relief 
model and “Global marine gravity” data (V18.1), we can verify the 
gravity patterns associated with impact cratering and check their validity 
as tracers of bolide trajectories. Gravity data also have seismological
implications and can be used to identify seismic blocks, lineaments, and 
other structures detectable with GIS ENDDB mathematical tools and thus 
to analyze the spatial patterns of seismicity. 
In the late 1980s, a custom system with GIS elements for simulating
tsunami waves was designed at the Institute of Computational Mathematics 
and Mathematical Geophysics (formerly the Computing Center) in 
Novosibirsk. The system allowed setting an elliptic model source of the 
tsunami and then calculating the arrival times of tsunami waves at all grid 
nodes of the modeling domain using two built-in programs, as well as 
estimating the height and velocity wave components in subregions selected 
from a large offshore and coastal area. The digital geographic map had a 
complex structure, with the coastline and sea depth contours presented in a 
vector format and the land topography and bathymetry displayed by different 
colors on the computer screen. The system was first demonstrated in 1989 
during the International Tsunami Symposium [8]; its description was reported 
at the All-Union Tsunami Workshop in 1990 [9] and at the XX IUGG General 
Assembly in 1991 [10], and in Computing Technologies [11]. Concurrently, a 
lot of work was initiated by A. Mikheeva in 1990 to develop an independent 
GIS for visualizing the Earthquake and Tsunami Database on a map and to 
collect information for the Database [12]. 

In 1994 the systems were combined into an expert system (Expert 
Tsunami Database, ETDB) for visualizing earthquake and tsunami catalogs 
with the possibility of modeling different scenarios of tsunami generation and 
propagation from model sources [13]. A modification of ETDB [14] was 
recommended as a prototype for regional Tsunami Databases at the Fifteenth 
Session of the International Coordination Group of UNESCO for the Tsunami 
Warning System in the Pacific [15]. At the same time, Anna Mikheeva and 
Petr Dyadkov began to develop a parallel version of an expert system (Figure 
1) for seismicity studies in the Baikal region (Expert Earthquake Database, 
EEDB), at the Trofimuk Institute of Petroleum Geology and Geophysics 
(Novosibirsk). It was a gradual transition from a conventional GIS and DB to a
high-tech expert system, updated by successively adding various mathematical
methods for earthquake data processing, new seismicity parameters, and
advanced representation tools.

The term “expert” in the name of these systems reflects one of their basic
features: providing the user with necessary information and advanced 
techniques for specific research tasks. 
Later, subprograms were designed for seismic and tsunami data processing
in various samples from the ETDB databases, in addition to updated Digital
Mapping [16]. Visualization of digital mapping was implemented in several 
projections (geographical, orthographical, etc.). The system required 
geographical and geophysical data, which either were unavailable to the 
developers or did not exist in the digital format. To provide the system tools 
with such data, an interactive system was developed for digitizing bathymetric 
and other paper maps [17-18]. Specifically, the same approach was used to 
create vector data of seismogenic faults in some areas of the Russian Far East 
and a new detailed bathymetry grid for areas around the Kamchatka Peninsula 
and the Kuriles [19-20]. 

Later versions of the geographic information systems, built on the authors'
earlier work described above, appeared in the 2000s and covered new subject
areas of geology and geophysics. The ETDB, EEDB, and ENDDB systems all share
the same structure and represent a set of interacting software units: a
relevant database subsystem (tsunami, earthquake, or impact structure
databases), a geographic subsystem, and a subsystem for data analysis, joined
in the user interface part. In the prototype systems that we developed before
ENDDB and applied to tsunami research [14, 21, 22], the first unit contained
an earthquake and tsunami database, while in the latest modification, ENDDB,
it is a database of earthquakes and impact structures.

The user interface part of the systems has changed significantly, from
a graphical shell functioning in the MS-DOS environment (Figure 1), with
Turbo-Pascal tools, limited by the resolution of the EGA and VGA graphic 
adapters, to the up-to-date Windows standards of menus and dialogs (Figure 
2), produced by means of the MFC-library. The environment formed by this 
library defines the skeleton of the application to be developed and provides the 
developer with standard tools of creating a multi-window interface. The first 
prototype (WinETDB-project) of the Windows-95 user interface was created 
by Denis Ivaykin, Alexander Lyskovskiy, and Ekaterina Chernykh, students of 
the Higher College for Information Theory at Novosibirsk University, who 
also converted the geographical subsystem into the Visual C++ codes [22, 23]. 

Recently A. Mikheeva has adapted the development environment and all 
subsystems of ENDDB to a 64-bit platform, and made a new version of the 
package for Windows 7 and 8. The same work was carried out for the 
environment of supported applications designed specially to complement the 
resources of ENDDB. The codes of the main program and its supported 
applications were translated to the standards of the latest Visual Studio and 
Firefox versions. One such application, a new converter of seismological
formats for 64-bit computing, has been written with the assistance of
Magomedrasul Magomed-Kasumov, engineer of the Dagestan Branch of RAS
Geophysical Surveys.

This chapter gives a review of a tool for studying natural disasters, such as 
earthquakes and impact events, using real data from catalogs of earthquakes 
(EC) and Earth’s impact structures (EISC) [1]. It is a new version of a 
geoinformation system (GIS), the ENDDB (the Earth’s Natural Disasters 
DataBase [2]), which combines two earlier systems: GIS EEDB [3, 24] and 
EISC [6]. These two prototypes are described in more detail in Parts 2 and 3 of 
the chapter. The methods implemented in ENDDB allow visualizing selected 
parts of a current catalog on a pseudo-3D background map. With its 
mathematical support, the ENDDB system can plot frequency dependences of 
magnitudes or sizes (crater diameters) of events from various samples, as well 
as other distributions of integrated parameters in time and space, or their 
relationships with one another. 



Figure 1. The main window of the EEDB, DOS platform. 
Figure 2. The main window of ENDDB, Win 64 platform. The original World map for the primary region choice. The option of the
earthquake catalog listing is open.

Figure 3. The flow chart of the GIS-EEDB software.

Figure 4. Regional zoning according to the completeness of earthquake catalogs. Geographical layers on the elevation map: earthquakes
(Ms = 5), country boundaries, rivers, and different fracture zones. Dashed line shows the GS RAS region boundaries according to [25,
26].

The geographic subsystem of ENDDB, which is an extension of the
previous geographic shell [22], comprises the following components: 


* primary geographic data;
* a set of algorithms and processing software modules;
* map visualization software;
* tools for forming and implementing spatial queries.


Methods of digital mapping and GIS technology have made it possible to 
create a system for selecting and visualizing geophysical data on a 
cartographic base, which serves as a tablet (in our case, in the rectangular 
projection for plotting data of catalogs and other related data). Description of 
the methods is given below (Part 1). 

The main problem in cartographic modeling of the Earth's surface 
topography is to choose the simplest and most appropriate way of populating 
the Geographic Database. Mapping in our system is performed using a grid 
digital elevation model. The initial region choice is made from an overview 
World map stored in the memory, in a rectangular projection. Once the 
elevation map, the shoreline contours, and the default geographic layers appear 
in the window, the region can be selected using the inverse frame (Figure 2). 

Currently, the total volume of the auto-generated and author’s program 
code (2/3 created by Anna Mikheeva) of GIS-ENDDB is about 23 MB and 
consists of ~ 240 classes. 

The program product GIS-ENDDB can be installed from the original media on
any computer running a 32-bit or 64-bit Windows platform (including Windows
95, 98, 2000, 2003, 2007, 2010 NT/XP, and Windows 7 & 8) and requires
approximately 4 GB of hard-disk space.


PART 1. THE EXPERT EARTHQUAKE DATABASE (EEDB) 


P. G. Dyadkov, A. V. Mikheeva and An. G. Marchuk


The EEDB interactive computing system [3] was developed by the authors 
for seismic and geodynamic applications and can be considered as an 
automated workstation for users engaged in seismic and geodynamic research 
of different geographic areas on different scales. The EEDB flow chart [24] 
(Figure 3) represents a set of interacting program units: a seismological
database, a geographical subsystem, and a subsystem for data analysis. 

The seismological DB lying at the base of the system contains 63 catalogs
of historical and instrumental earthquakes, including both catalogs from
well-known agencies and geophysical surveys and the authors' own catalogs,
which combine and filter data collected from various sources. The
reasons for using several catalogs are as follows. First, this allows dealing with 
both global- and local-scale (world and regional catalogs, respectively) 
earthquake data. Second, by comparing different geographically overlapping 
catalogs in terms of completeness, with the use of the EEDB methods 
(earthquake frequency histograms and plotted time series of magnitudes), one 
can divide the map into zones (“regions”) and select the respective preferable 
catalogs (Figure 4). 

We most often use global catalogs (e.g., NEIC, 1973 to present, and
SIGN, -2000 to 1993, of the U.S. Geological Survey, USGS) and
regional catalogs, such as the Baikal catalog (hereafter BAIK, in which 87%
of data belong to the Baikal Branch of Geophysical Survey (GS) SB RAS,
available at http://www.seis-bykl.ru) and the Altai catalog (Altay-Sayan
Branch of GS SB RAS). The overlapping boundaries of catalog completeness
that we revealed for the BAIK catalog are even wider
than those indicated by the authors of the GS catalogs (Figure 4) [25, 26].


Figure 5. Earthquake catalog format. The columns are Year, Month, Day, Hour, Min, Sec, Latitude, Longitude, Depth, Ms, K, Book, and Length.


The earthquake record format contains fields with key source parameters:
energy, magnitude, source size (length), origin time, location (coordinates),
and depth (Figure 5). The released seismic energy is the principal
characteristic, which may be expressed on either the magnitude or the energy
scale. Most catalogs worldwide (e.g., NEIC) use the magnitude scale based
on body waves (mb), while Russian catalogs (e.g., that for the Altai
territory) use the energy scale, with energy classes K = lg E determined by
T. Rautian's technique [27]. To bring all data to a universal scale, we
convert energy classes to surface-wave magnitudes (Ms) using Richter's
formula K = 4.8 + 1.5·Ms (for the global catalogs) and its variants
K = 4.0 + 1.8·Ms for K < 14 and K = 8 + 1.1·Ms for K >= 14 (applied to
regional catalogs, specifically, to events in West and East Siberia). Ms is
recalculated from mb as Ms = (mb - 2.4)/0.5556, which is an empirical
relation obtained from known magnitude pairs. The length of seismic rupture
(L) is found as lg L = aK + c, where a and c are empirical constants [28].
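The conversions above can be sketched in a few lines of Python. This is only an illustration of the stated formulas, not the EEDB code: the function names and the `regional` flag are hypothetical, and the constants a and c of the rupture-length relation are not given in the text, so they are left as arguments.

```python
def energy_class_to_ms(k, regional=False):
    """Convert energy class K to surface-wave magnitude Ms."""
    if not regional:
        # Global catalogs: K = 4.8 + 1.5*Ms
        return (k - 4.8) / 1.5
    if k < 14:
        # Regional catalogs, K < 14: K = 4.0 + 1.8*Ms
        return (k - 4.0) / 1.8
    # Regional catalogs, K >= 14: K = 8 + 1.1*Ms
    return (k - 8.0) / 1.1


def mb_to_ms(mb):
    """Empirical relation from the text: Ms = (mb - 2.4) / 0.5556."""
    return (mb - 2.4) / 0.5556


def rupture_length(k, a, c):
    """Rupture length L from lg L = a*K + c (a and c supplied by the caller)."""
    return 10.0 ** (a * k + c)
```

For instance, with the regional formula an event of class K = 11 corresponds to Ms of about 3.9.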

The seismological unit also includes preprocessing of the initial catalog 
data to select a subset of earthquakes according to the inquiry parameters: 
choice of a current catalog, time range, space range, magnitudes, etc. 
Furthermore, the users can remove aftershocks from the selected earthquakes
with three independent algorithms. The first algorithm (tentatively named a
statistical algorithm) is based on the parameters of the space-time
separation between aftershocks and the main shock (dT and dS), which have
been obtained from the available aftershock statistics and depend on the main
shock magnitude: dT = (M_main - 4) + 162; dS = 3*L.

The second (elliptical) algorithm (Figure 6), most frequently used to 
remove aftershocks, consists of several runs: 


1) run 1: estimating the density of non-aftershock events (aftershocks are 
removed according to the statistically found parameters); 

2) run 2: preliminary removal of aftershocks on a rectangular grid with 
the cell size proportional to the main shock magnitude; 

3) plotting an aftershock ellipse isolating the aftershocks by the
maximum-likelihood method or according to rms deviation from the
sampling center;

4) subsequent runs: separating aftershocks level-by-level, in the elliptic
metric.


At Steps 2 and 4, the time window (dT) of aftershock search increases
proportionally to the ratio of the current number of aftershocks to the total
number of events within the rectangular or elliptic areas [29].

Prozorov's method has been modified by A. Mikheeva as follows: 


* all located aftershock sequences are considered simultaneously in a
single run;
* the minimum size of the rectangular metric is set interactively;
* unlike the classical case (Figure 6 B, c) of calculating the elliptic
metrics, it is suggested to create an ellipse of equal probability (Figure
6 B, a).


In the third method, called interactive, the space-time window values (dS 
and dT) are set up by the user. The results of the elliptic method are considered 
below in more detail (Figure 6). 
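A minimal sketch of the interactive (window) approach is given below, assuming that an aftershock is any smaller event that follows a main shock within the user-specified dT and dS. The event layout, the function names, the processing order (largest events first), and the use of the great-circle distance are illustrative assumptions rather than a description of the EEDB implementation.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def flag_aftershocks(events, d_t_days, d_s_km, min_main_mag):
    """Return one boolean per event: True if flagged as an aftershock.

    events: list of (time_days, lat, lon, magnitude).
    """
    flags = [False] * len(events)
    # Candidate main shocks are processed from the largest magnitude down.
    order = sorted(range(len(events)), key=lambda i: -events[i][3])
    for i in order:
        t0, lat0, lon0, m0 = events[i]
        if flags[i] or m0 < min_main_mag:
            continue
        for j, (t, lat, lon, m) in enumerate(events):
            if j == i or flags[j]:
                continue
            later = 0 < t - t0 <= d_t_days
            if m < m0 and later and haversine_km(lat0, lon0, lat, lon) <= d_s_km:
                flags[j] = True
    return flags
```

Events flagged `True` would be dropped from the sample before further analysis.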

In the elliptical method, which includes the described steps, the following 
parameters are set up: threshold signal/noise ratio Rsm, minimum main shock 
magnitude, minimum aftershock magnitude, minimum size of the rectangular 
metric, etc. The classical way of finding the ellipse parameters [29] may be 
with or without weighting (Figures 6 B, b and c), which depends on the 
number of events that fall into the cell. Weighting makes sense if aftershock 
swarms are strongly scattered. 

Experience has shown that, in some cases, our modified method of
identifying aftershocks may be advantageous. In this method, the spatial
pattern of aftershocks is constrained by an equal-probability ellipse:

$$Q(x, y) = \frac{1}{1-\rho_{12}^{2}}\left(\frac{x^{2}}{\sigma_{1}^{2}} - \frac{2\rho_{12}\,xy}{\sigma_{1}\sigma_{2}} + \frac{y^{2}}{\sigma_{2}^{2}}\right) = \mathrm{const} = \lambda^{2},$$

where $\lambda^{2} \approx 2\left(1 - \frac{2}{9\cdot 2} + 3.29\sqrt{\frac{2}{9\cdot 2}}\right)^{3}$ is an approximation of the quantile of the
$\chi^{2}$ distribution with two degrees of freedom at P = 0.9995; $\sigma_{1}^{2} = DX$ and $\sigma_{2}^{2} = DY$
are the variances of x and y, and $\rho_{12}$ is the correlation coefficient between x
and y. The ellipse parameters thus estimated for identifying aftershocks of the
16.09.2003 earthquake in the northern Baikal Rift Zone (BRZ) exceeded those
obtained by the classical way in both the number of selected events (263 and 246,
respectively) and the aftershock sequence length (3.9 and 1.6 times, respectively;
Figure 6 B, e). Another advantage of our modified method is that the results
are almost independent of the Rsm threshold.
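The equal-probability ellipse can be illustrated with the short sketch below. It assumes that x and y are counted from the sample mean and that the quantile is approximated with the Wilson-Hilferty formula, in which 3.29 is the standard normal quantile at P = 0.9995; the function names are hypothetical and the code is not taken from EEDB.

```python
import math


def chi2_quantile_wh(z_p, dof=2):
    """Wilson-Hilferty approximation of a chi-square quantile.

    z_p is the standard normal quantile for the chosen probability.
    """
    h = 2.0 / (9.0 * dof)
    return dof * (1.0 - h + z_p * math.sqrt(h)) ** 3


def fit_ellipse(xs, ys):
    """Return (mean_x, mean_y, var_x, var_y, rho) of a point sample."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    var_y = sum((y - my) ** 2 for y in ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    rho = cov / math.sqrt(var_x * var_y)
    return mx, my, var_x, var_y, rho


def inside_ellipse(x, y, params, lam2):
    """True if the point lies inside the ellipse Q(x, y) = lam2."""
    mx, my, var_x, var_y, rho = params
    dx, dy = x - mx, y - my
    q = (dx * dx / var_x
         - 2.0 * rho * dx * dy / math.sqrt(var_x * var_y)
         + dy * dy / var_y) / (1.0 - rho * rho)
    return q <= lam2
```

With `lam2 = chi2_quantile_wh(3.29)`, events for which `inside_ellipse` returns `True` would be treated as candidate aftershocks of the main shock whose sample defined the ellipse.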

The classical and modified aftershock removal algorithms were compared 
in terms of efficiency by estimating the statistics of the resulting sets (Figure 6 
B, f). The classical removal of aftershocks has shown significant deviation of 
the observed distribution from the theoretical Poissonian distribution [30] both 
before (Figure 6 B, f - 1), and after the procedure (Figure 6 B, f - 2), while the 
modified algorithm shows no deviation (Figure 6 B, f - 3). 

The exponential distribution of inter-event times for a Poisson process is given by:

$$f_T(t) = \lambda \exp(-\lambda t),$$

where t = dT is the time between two subsequent earthquakes, $\lambda$ is the flow
rate of events in time, $\lambda = 1/T_{av}$, and $T_{av}$ is the average recurrence time.
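One simple way to quantify the deviation of a declustered sample from this distribution is sketched below. The maximum distance between the empirical and theoretical cumulative distributions (a Kolmogorov-Smirnov-type statistic) is used here as an assumed measure; the text does not specify which statistic EEDB applies, and the function names are hypothetical.

```python
import math


def interevent_times(origin_times):
    """Time differences between successive events (input need not be sorted)."""
    ts = sorted(origin_times)
    return [b - a for a, b in zip(ts, ts[1:])]


def exp_pdf(t, rate):
    """Exponential density: rate * exp(-rate * t)."""
    return rate * math.exp(-rate * t)


def distance_to_exponential(dts):
    """Return (rate, d): the fitted rate 1/mean and the maximum distance
    between the empirical CDF and the exponential CDF."""
    n = len(dts)
    rate = 1.0 / (sum(dts) / n)
    d = 0.0
    for i, t in enumerate(sorted(dts)):
        f = 1.0 - math.exp(-rate * t)
        d = max(d, abs((i + 1) / n - f), abs(i / n - f))
    return rate, d
```

A small distance suggests that the remaining events behave as a Poisson flow; a large one points to residual aftershocks or swarms.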
Figure 6. A: Histogram of the daily number of aftershocks of the 2003 Altai (Chuya)
event. Shown on the right is a map view of the aftershocks. B: Options for calculating
elliptic metrics to determine aftershocks, with an example of the M = 5.8 earthquake of
16.09.2003 in BRZ: a) confidence ellipse (equal probability), b) weighted root-mean-square
deviation [29], c) root-mean-square deviation, d) space distribution of events
before removing aftershocks, e) time distribution of aftershocks identified by different
ellipse metrics (a), (b) and (c), Rsm = 15, f) deviation of the observed number of
earthquakes (M > 1.5 since 1987) from the theoretical Poisson distribution before (1) and
after (2, 3) removing aftershocks that followed the M = 5.8 event of 13.05.1989.
After the aftershock removal procedure, the sample contains many related
events making up swarms. The swarms are removed in the same way as 
aftershocks, except for the magnitude relation between the main and 
subordinate events: in the case of swarms, the events following the one that 
triggered the swarm process may have any magnitude, either larger or smaller 
than the trigger event. Another point of difference is that the duration of the
seismic process for swarm sequences is specified by the user interactively
rather than being calculated from the number of events (as in [29]), as in the
case of aftershocks, because the time distribution of events in a swarm is
different from that in an aftershock sequence.

The geographic subsystem is an important part of any GIS, because 
proper visualization of geophysical data is a necessary prerequisite for correct 
interpretation. Methods of digital mapping and appropriate GIS-technologies 
allow developing a system for visualizing seismological data on the 
cartographic basis to meet the user's requirements for plotting earthquake 
catalogs, as well as the related information. 

The GIS-EEDB subsystem of cartographical support uses shaded-relief 
raster images for creating digital geographic maps. The 3D-effect is provided 
by successive triangulations and by calculating the brightness of triangles. 

First, the whole selected area is triangulated, with each
rectangular grid cell split into at least two triangles (maximum 16 triangles). The
number of triangles depends on the user-specified enlargement of the picture 
size relative to the original data array. The program computes the brightness of 
each triangle based on its orientation relative to the light source. In particular, 
if the original array and the created image are of the same size, the light plane 
is assumed to pass through the top left, top right and bottom left vertices of 
each cell. If the illumination direction is different, the cells are split in a 
different way as well. The parameters of illumination and the color scale are 
specified by the user, and various shades of brightness are then obtained 
therefrom. 

The basic steps of the shading algorithm, designed by An. Marchuk, are as 
follows: 


1) A surface is divided into a certain number of triangles calculated from 
the ratio of the desired picture size and the dimension of the array 
where the heights of all vertices with respect to a reference level are 
known for each triangle. 

2) For each triangle, the vector (cross) product of two vectors along the
sides of the triangle is found.


3) The resulting vector is normalized. 

4) The brightness of the triangle is calculated according to the angle of 
the resulting normal vector to the light direction in the 3D scene. The 
light source is usually taken with parallel rays. 

5) The computed brightness is used to obtain luminance gradation of the 
corresponding color determined by the above-sealevel elevation of the 
point and the user-specified color scale. 

6) The 2D projection is drawn according to the parameters of the picture 
plane. 
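Steps 2 to 5 of the algorithm can be sketched as follows. The sketch assumes a distant light source with parallel rays given as a vector pointing toward the light, and a simple linear scaling of an RGB color by the brightness; the function names and the clamping of back-facing triangles to zero brightness are illustrative choices, not details taken from the original program.

```python
import math


def triangle_brightness(p0, p1, p2, light):
    """Brightness in [0, 1] of a triangle with vertices p0, p1, p2 = (x, y, z).

    light is a vector pointing from the surface toward the light source.
    """
    ux, uy, uz = (p1[i] - p0[i] for i in range(3))
    vx, vy, vz = (p2[i] - p0[i] for i in range(3))
    # Cross product of two side vectors gives the triangle normal.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    if nz < 0:  # orient the normal upward
        nx, ny, nz = -nx, -ny, -nz
    n_len = math.sqrt(nx * nx + ny * ny + nz * nz)
    l_len = math.sqrt(sum(c * c for c in light))
    if n_len == 0.0 or l_len == 0.0:
        return 0.0
    # Cosine of the angle between the unit normal and the light direction.
    cos_angle = (nx * light[0] + ny * light[1] + nz * light[2]) / (n_len * l_len)
    return max(0.0, cos_angle)


def shade_color(rgb, brightness):
    """Scale an (r, g, b) color taken from the elevation color scale."""
    return tuple(int(round(c * brightness)) for c in rgb)
```

A horizontal triangle lit from directly above gets brightness 1, while a light source at 45 degrees from the vertical gives about 0.71.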


A similar algorithm is known in the literature on 3D computer graphics as
Gouraud shading. To construct maps of different
scales, from global to local maps, an appropriate data array with an optimum 
spatial resolution is automatically selected, with the quantity of triangular 
elements in each grid cell fitting the selected map scale [31]. Currently there 
are some global databanks representing the surface topography to different 
resolutions [32], such as the best known GTOPO-30 and SRTM-90 ones, with 
30 and 3 arc-second resolutions, respectively. The GTOPO-30 and SRTM-90 
databanks are open-file digital elevation models developed by the U.S. 
Geological Survey (USGS). The mapping program uses higher-resolution (90-m)
SRTM-90 data at the local level, when zooming to specific areas of interest
(in the territory of Russia). Then the vector and point layers, and explanatory 
texts, are superposed onto a raster image. The vector technology is applied to 
the level-by-level screen visualization of shorelines, rivers, national frontiers, 
fractures and faults of different geometries (thrust, reverse, strike-slip, normal, 
and oblique-slip faults). This technology aims at reducing the amount of saved 
data by storing the coordinates of linear objects as vectors. The thickness of 
lines in the image remains constant when objects are zoomed. 

The point information is stored in simple text files and consists of such 
layers as geophysical observation points and locations of volcanoes and 
settlements, and can be easily populated with any other point data. 

Full description of layers in the geographic database is given in Table 1. 
The system is open to adding vector, raster, and point digital geographic data, 
which can be converted into the system format. 

Additional options developed for the geographical subsystem include: 


a) different ways of visualizing (with or without animation) earthquakes 
in geographic maps and cross sections (Figure 8); 
b) a mapping algorithm for creating a series of zonal maps at equal time
intervals using linear interpolation by the 2D Bessel formula (1);

c) a program for seismic division for simultaneous analysis of several 
tectonic areas. 


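The linear interpolation formula (2D Bessel formula) is:

$$z(x, y) = \tfrac{1}{4}\,(z_{00} + z_{10} + z_{01} + z_{11}) + \tfrac{1}{2}\left(u - \tfrac{1}{2}\right)(z_{10} - z_{00} + z_{11} - z_{01}) + \tfrac{1}{2}\left(v - \tfrac{1}{2}\right)(z_{01} - z_{00} + z_{11} - z_{10}) + \left(u - \tfrac{1}{2}\right)\left(v - \tfrac{1}{2}\right)(z_{11} - z_{10} - z_{01} + z_{00}) + \ldots \qquad (1)$$

where $\Delta x = \Delta y = h$ is a fixed increment and:

$$z(x_0 + j\,\Delta x,\; y_0 + k\,\Delta y) = z_{jk} \quad (j, k = 0, \pm 1, \pm 2, \ldots), \qquad u = \frac{x - x_0}{\Delta x}, \quad v = \frac{y - y_0}{\Delta y}.$$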

The visualization technology of the zonal maps, which uses the 2D Bessel 
linear interpolation by (1), includes three stages: 


1) Dividing the area into elementary cells; 

2) Calculating the sought parameter for each cell; 

3) Mapping this parameter (by linear interpolation) for successive time 
intervals before a large event. 


See Figure 9 for the mapped damage parameter Kavg as an illustration to 
this technology. 
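The first two stages can be illustrated with a minimal sketch in which the mapped parameter is simply the cumulative number of events per cell. The actual Kavg computation is not reproduced here, and the function names and grid arguments are hypothetical.

```python
def cumulative_cell_counts(events, t_end, lat_min, lon_min,
                           d_lat, d_lon, n_rows, n_cols):
    """Count events with origin time <= t_end in each cell of a regular grid.

    events: iterable of (time, lat, lon); rows run along latitude.
    """
    counts = [[0] * n_cols for _ in range(n_rows)]
    for t, lat, lon in events:
        if t > t_end:
            continue
        row = int((lat - lat_min) // d_lat)
        col = int((lon - lon_min) // d_lon)
        # Events outside the mapped area are ignored.
        if 0 <= row < n_rows and 0 <= col < n_cols:
            counts[row][col] += 1
    return counts


def map_series(events, t_ends, **grid):
    """One cell map per end time, e.g. at every 10 years."""
    return [cumulative_cell_counts(events, t, **grid) for t in t_ends]
```

Each map in the series could then be smoothed between cell centers by the interpolation of formula (1) before drawing.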

The data analysis subsystem includes methods and algorithms [4] for 
GIS analysis of earthquake catalogs based on our research and on published 
results by experts in seismicity and geodynamics. 

The first layer contains procedures for checking the completeness and
quality of catalogs using the time series of earthquake number and magnitudes
(N(t) and Ms(t), respectively) [33]. The next layer provides visual analysis of
seismic parameters and consists of graphical and cartographical sublayers
(Figure 3). The graphical methods apply to plots, histograms, and diagrams,
including petal and azimuthal diagrams. They are, for example, histograms of
released seismic energy averaged over a selected time interval (lg Eavg(t),
joules), magnitude vs. frequency relationship (linear-regression empirical
histograms of the number of events of certain magnitudes), slope of recurrence
curve vs. time b(t), etc. The empirical regression line of the
magnitude-frequency function is obtained using the maximum-likelihood and least
squares methods. The b-values can also be estimated using Utsu's formula
[34], without calculating the N(M) function, to a better accuracy than with
least squares.
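The two ways of estimating b can be compared with the sketch below. It assumes that "Utsu's formula" refers to the widely used Aki-Utsu maximum-likelihood estimate b = lg e / (M_mean - (M_min - dM/2)), where dM is the magnitude bin width; the least-squares variant fits a straight line to the cumulative lg N(M). The function names and the default bin width are illustrative assumptions.

```python
import math


def b_value_aki_utsu(mags, m_min, bin_width=0.1):
    """Maximum-likelihood b-value from the mean magnitude above m_min."""
    sel = [m for m in mags if m >= m_min]
    mean_mag = sum(sel) / len(sel)
    return math.log10(math.e) / (mean_mag - (m_min - bin_width / 2.0))


def b_value_least_squares(mags, m_min, bin_width=0.1):
    """b-value as minus the slope of a line fitted to cumulative lg N(M).

    Requires events at two or more magnitude levels.
    """
    sel = [m for m in mags if m >= m_min]
    n_levels = int(round((max(sel) - m_min) / bin_width)) + 1
    xs, ys = [], []
    for i in range(n_levels):
        level = m_min + i * bin_width
        n_cum = sum(1 for m in sel if m >= level - 1e-9)
        if n_cum > 0:
            xs.append(level)
            ys.append(math.log10(n_cum))
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -sxy / sxx
```

The first function needs only the mean magnitude of the sample, which is why no N(M) function has to be built.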


Table 1. Structure, format and content of the GIS EEDB geographic
database and the relevant geographic layers


Data format       Folder   Layer names       Information content

Raster data       Raster   Relief            Digital elevation model
(files *.bin)

Vector data       Vector   Coast             Data on coastlines
(files *.vec)              Rivers            Global network of rivers and lakes
                           River details     Detailed local networks of rivers and lakes
                           Country           National frontiers
                           Republics         Frontiers of autonomous republics
                           C                 Administrative division
                           Roads             Roadway network
                           Railways          Railway network
                           Plates            Plate boundaries
                           Tectonic Zones    Seismic lineaments
                           Faults            Fault zones
                           Thrust            Thrust faults
                           Normal            Normal faults
                           Oblique-slip      Oblique-slip faults
                           Reverse           Reverse faults
                           Reverse-oblique   Reverse-oblique-slip faults

Point data        Point    Cities            Cities and towns
(files *.txt)              Earthquakes       Earthquake epicenters
                           Volcanoes         Locations of volcanoes
                           TideNet           Tsunami observation points
                           SeismNet          Seismic observation points
                           Mag points        Tectonomagnetic observation points
                           Resists           Vertices of regions covered by different catalogs
                           Mech points       Earthquake focal mechanisms

Computable data            Grid              Geographical network
Figure 7. A: Time series of rupture density Kavg [35] for the Chuya earthquake area,
Ms = 2: 1) real curve with ΔT = 1 year, 2) theoretical curve of uniform fracture growth
(ΔT = 1 year), 3) real curve with ΔT = 1 month shifted upward. Arrows show
coincidence of curves 1 and 2. Box frames a fragment corresponding to the period
between 1987 and 2000 when the real curve flattened out with ΔT = 3 months. B: real
curve of Kavg for the Chuya earthquake area, Ms > 2.5. The curves have become
smoother since 1986 and fall abruptly since 2000.


The module for calculating the parameter of environment damage, or
seismic rupture density Kavg(t) (Figure 7), is one of the most recent additions
to the graphic methods. The time series of this parameter provide an idea of
seismic stability reflected in physical changes of the environment. The 
stability is understood as uniform increment of rupture length and number of 
earthquakes. 

The cartographical methods imply contour line mapping, animation 
cartography (visualizing earthquakes as gradually fading flares spaced at time 
intervals proportional to real time) and constructing vertical cross sections and 
patterns of elevation and seismicity (Figure 8). 

The built-in mapping subsystem unit can produce cartograms showing 
distribution of such seismicity parameters as total seismic energy, which is 
useful to highlight zones of quiescence preceding large earthquakes; 
distribution of the b parameter (slope of recurrence curve); maps of energy 
stability (Kavg) and its rms error σ; contour lines of seismic activity (A10, A15,
where A is a long-term average number of earthquakes of certain energy: K =
10, 15) (as, for example, the Kavg map in Figure 9).
Figure 8. Vertical cross sections of topography and seismicity in Northeastern Japan,
offshore area around the Tohoku earthquake (a), the North Baikal region (b), and area 
of the Pacific subduction zone (c). 


Clustering of earthquakes (grouping according to their relations) is 
another method of spatial data analysis. The earthquake clusters are associated 
with natural localization of seismicity in zones of active faulting, e.g., along 
boundaries of plates or blocks. Clustering of earthquakes has implications for 
the pattern of seismicity, which then can be compared to locations of 
geological structures. To reveal earthquake clusters, the user has to specify the 
maximum space and time distance between events in all pairs (dT for time and 
dS for space) and the type (temporal or spatial) of clustering. 
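A minimal sketch of this kind of clustering is shown below: two events are linked when they are separated by no more than dT in time and dS in space, and linked events are merged into clusters with a union-find structure. Planar coordinates in kilometers and the requirement that both conditions hold at once are simplifying assumptions; the choice between temporal and spatial clustering mentioned above is not modeled.

```python
import math


def find_clusters(events, d_t, d_s):
    """Group events linked by pairs closer than d_t in time and d_s in space.

    events: list of (time, x_km, y_km). Returns lists of event indices,
    one list per cluster with two or more events.
    """
    n = len(events)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        ti, xi, yi = events[i]
        for j in range(i + 1, n):
            tj, xj, yj = events[j]
            if abs(tj - ti) <= d_t and math.hypot(xj - xi, yj - yi) <= d_s:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [g for g in groups.values() if len(g) > 1]
```

Because links are transitive, a chain of close pairs forms one cluster even if its first and last events are farther apart than dS.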
Figure 9. Spatial distribution of cumulative rupture density Kavg for Ms = 2.0-4.5 events
in 1964 through 2007: Altai-Sayan area, lat. 46-52° N; long. 82-91.35° E (top panels),
cell size 0.3×0.5°; area of Chuya earthquake of 27.09.2003 (bottom panel), cell size
0.2×0.3°. The maps present the data at every 10 years.


Cluster analysis programs allow plotting time series of the number of 
related events (with or without averaging), relationships of the number of 
events in clusters vs. total number of events, the conformity of the clustered 
events to the Poisson distribution N(dt), etc., or constructing rose diagrams of 
cluster azimuths. 
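The data behind a rose diagram can be prepared as in the sketch below. It treats the orientation of a pair as axial (SE-NW is the same as NW-SE), so azimuths are folded into the range from 0 to 180 degrees, and it uses a local flat-Earth approximation that is adequate only for closely spaced events; both choices, like the function names, are assumptions of the example.

```python
import math


def pair_azimuth(lat1, lon1, lat2, lon2):
    """Axial azimuth of the segment between two epicenters, in degrees
    clockwise from north, folded into [0, 180)."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    east = math.radians(lon2 - lon1) * math.cos(mean_lat)
    north = math.radians(lat2 - lat1)
    return math.degrees(math.atan2(east, north)) % 180.0


def rose_histogram(azimuths, sector_deg=30):
    """Number of pairs falling in each azimuth sector of the half-circle."""
    n_sectors = int(180 // sector_deg)
    counts = [0] * n_sectors
    for az in azimuths:
        counts[int(az // sector_deg) % n_sectors] += 1
    return counts
```

The share of pairs in a given sector is obtained by dividing its count by the total number of pairs.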

Detecting and analyzing clusters of related events shows the distribution
of seismicity in space, while comparing the seismicity patterns in different
periods of time traces the history of the seismic process.

The EEDB system uses several methods to reveal earthquake clusters: 


* introduction of space-time parameters dT and dS;
* Sobolev's method (calculating dT and dS automatically proceeding
from the fractal space theory and physics of fracture) [36];
* estimating earthquake density using Morisita's index [37].
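As an illustration of the third method, the standard Morisita index for counts in Q equal cells is I = Q * sum(n_i * (n_i - 1)) / (N * (N - 1)), where N is the total number of events. Values near 1 indicate a random pattern, values above 1 clustering, and values below 1 a regular pattern. The sketch below computes it from precomputed cell counts; how EEDB chooses the cells is not described in the text, so the input layout is an assumption.

```python
def morisita_index(cell_counts):
    """Morisita index of dispersion for a flat list of per-cell event counts."""
    q = len(cell_counts)
    n = sum(cell_counts)
    if n < 2:
        raise ValueError("at least two events are required")
    return q * sum(c * (c - 1) for c in cell_counts) / (n * (n - 1))
```

For example, ten events concentrated in one of four cells give an index of 4, whereas five events in each of the four cells give about 0.84.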


The clusters obtained by the first method are analyzed using jointly the 
geoinformation approach (methods of cartography) and elements of graphic 
analysis. 

For example, changes in faulting activity can be detected from patterns of 
small shocks, in several steps: 
1) Preliminary step: selecting a part of a catalog and removing
aftershocks and earthquake swarms from it (see above);

2) Main steps: detecting clusters using the relation parameters dT and dS. 
The values of the introduced parameters can be determined, for 
example, using dT or dS dependences of the number of earthquake 
pairs (two nearest successive events), and revealing intervals in which 
the number of pairs exceeds notably the exponential distribution (for 
dT) or earthquake number maxima (for dS) (Figure 10a); 

3) Visualizing the azimuth patterns of segments that connect events 
nearest in time or space (clustered earthquake pairs). For the time 
analysis of the seismic process, the considered period is divided into 
intervals (years, months, etc.) to reveal clusters with anomalous 
orientations of the pairs. For example, the rose diagram for 1991 shows
that the azimuths of clustered pairs in the Baikal rift differ markedly
from the usual pattern: 55% of the pairs have SE-NW
directions, unlike the N-S and SW-NE directions typical of the rift
area (Figure 10b);

4) Checking whether the revealed clusters belong to aftershocks and
swarms nearest in time, if they have similar primary azimuthal
orientations (Figures 11a-c);

5) Checking whether the sample still contains overlooked aftershocks to 
make sure that the considered clusters are not the residuals of 
aftershock sequences (Figures 11d, 11e); 

6) Zooming separate fragments of a territory in order to study the 
clusters of interest in more detail; 

7) Additional quality checking of the cluster analysis results using a
synthetic catalog in which the real data points are assigned origin times
chosen at random within a user-specified time range (an option
realized in EEDB as well). Because the points in the synthetic catalog are
invariable in space, the clusters hold their location but reduce
considerably in number, which is implicit evidence for the validity of
the previous interpretation made for the real data according to the
mapping and plotting results.
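The synthetic catalog of step 7 can be sketched as follows, assuming that the random origin times are drawn uniformly within the preset range; the text does not state the distribution, and the function name and event layout are hypothetical.

```python
import random


def synthetic_catalog(events, t_start, t_end, seed=None):
    """Keep epicenters and magnitudes, replace origin times by random ones.

    events: list of (time, lat, lon, magnitude). Returns a new list
    sorted by the randomized time.
    """
    rng = random.Random(seed)
    synth = [(rng.uniform(t_start, t_end), lat, lon, mag)
             for _, lat, lon, mag in events]
    synth.sort(key=lambda e: e[0])
    return synth
```

Running the same cluster search on the real and on the synthetic catalog, and comparing the number of clusters found, gives the check described in the step above.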


Thus, clustering, along with other techniques, belongs to methods used to 
analyze spatial seismicity patterns and to reveal, stage by stage, real geological 
structures and processes. There are also programs for plotting diagrams of 
earthquake mechanisms, which have implications for crustal stress changes, 
and some other cartographical techniques (see below). 
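
As an illustration of the azimuth analysis in step 3 of the list above, the
following sketch computes the orientation of the segment joining two
epicenters and the share of pairs in each sector of a rose diagram. The
function names, the 30° sector width, and the flat-Earth approximation are
assumptions chosen for brevity; the text does not specify how EEDB computes
azimuths.

```python
import math
from typing import Dict, Iterable, Tuple

Pair = Tuple[float, float, float, float]  # lat1, lon1, lat2, lon2 in degrees


def pair_azimuth(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Axial azimuth of the segment joining two epicenters, in degrees
    clockwise from north, folded into [0, 180) because a segment has no
    preferred direction (SE-NW and NW-SE are the same orientation)."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    east = (lon2 - lon1) * math.cos(mean_lat)   # shrink longitude with latitude
    north = lat2 - lat1
    return math.degrees(math.atan2(east, north)) % 180.0


def rose_fractions(pairs: Iterable[Pair],
                   sector_width: float = 30.0) -> Dict[float, float]:
    """Fraction of pairs in each azimuth sector, keyed by the sector's
    lower bound in degrees."""
    n_sectors = int(round(180.0 / sector_width))
    counts = [0] * n_sectors
    total = 0
    for lat1, lon1, lat2, lon2 in pairs:
        az = pair_azimuth(lat1, lon1, lat2, lon2)
        counts[int(az // sector_width) % n_sectors] += 1
        total += 1
    if total == 0:
        return {k * sector_width: 0.0 for k in range(n_sectors)}
    return {k * sector_width: c / total for k, c in enumerate(counts)}
```

A year in which one sector holds an unusually large fraction of pairs would
then stand out against the long-term pattern.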

Figure 10. Cluster analysis of selected data from BAIK Catalog for 1987-1992: a) dT
and dS dependences of the number of earthquake pairs, with marked intervals for
further clustering; b) rose diagrams showing the space azimuth patterns of the clustered
pairs (dT = 1-20, dS = 1-50). See prominent SE-NW orientations in 1991; c) the length
of the revealed clusters and the number of earthquakes in them: the longest clusters of
1991 form a continuous sequence containing 12, 9, 8 and 4 events.

Figure 11. Aftershocks nearest in time: a) map of aftershock sequences and swarms
detected by the program for 1987-1992; b) azimuths of aftershock pairs in 1990, when 
they markedly increased in number; c) number of events in aftershock sequences. 
Histograms describing the seismic process (2 < M < 4) in 1988-1992 after removal of 
aftershocks: d) total released energy per month and e) annual number of events. 

The b-Value Method: Example Application of the Built-In
Software for Data Analysis 


Besides being a way to determine the statistically significant magnitude
range (the arrows in Figure 12 A, b), the magnitude-recurrence relationship
$\lg N = \lg A_M - b\,(M - M_0)$ characterizes the seismic process in terms
of the seismic activity (A) normalized to a certain magnitude $M_0$
(converted from the energy scale K) and the slope of the recurrence curve
(b). Application of the algorithm for visualization of zonal maps can be
illustrated with b-value cartograms.
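
To make the recurrence relationship concrete, the sketch below counts
events per magnitude bin and fits a straight line to lg N by least squares;
the negated slope is the b value. The function names, the 0.5 bin width, and
the use of non-cumulative counts are assumptions made for the example and
are not taken from EEDB.

```python
import math
from typing import Iterable, List, Tuple


def recurrence_counts(magnitudes: Iterable[float], m_min: float, m_max: float,
                      bin_width: float = 0.5) -> List[Tuple[float, int]]:
    """Number of events in each magnitude bin, as (bin center, count)."""
    n_bins = int(round((m_max - m_min) / bin_width))
    counts = [0] * n_bins
    for m in magnitudes:
        k = int((m - m_min) // bin_width)
        if 0 <= k < n_bins:
            counts[k] += 1
    return [(m_min + (k + 0.5) * bin_width, c) for k, c in enumerate(counts)]


def fit_b_value(bins: List[Tuple[float, int]], m0: float) -> Tuple[float, float]:
    """Least-squares fit of lg N = lg A - b (M - m0) over non-empty bins.
    Returns (b, lg A), where lg A is the fitted lg N at M = m0."""
    pts = [(m - m0, math.log10(n)) for m, n in bins if n > 0]
    if len(pts) < 2:
        raise ValueError("at least two non-empty magnitude bins are required")
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return -slope, mean_y - slope * mean_x
```

With few events in a cell, many bins are empty and the fit becomes
unstable, which is the practical difficulty discussed next.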

Mapping b values may be problematic because the seismic process is 
irregular within elementary cells and, if there are few earthquakes in a cell, 
they may fail to represent the whole range of the magnitude-recurrence 
relationships used for estimating the b value. Furthermore, the estimation 
accuracy depends on the number of events in a sliding window, as well as on 
the statistical uniformity of the data sample, i.e. the window should cover a 
spatial area of the same geodynamic type and correspond to a time interval 
free from critical changes in seismic parameters [38]. That is why the method 
should be applied with great care, investigating whenever possible the
distribution of seismicity in each cell and checking whether the data sample
is representative.

The b cartograms for the BRZ (Figure 12 A) have been generated on a
dense grid (0.4x0.6°) for 8-year intervals. Such high resolution is possible
due to the use of the complete BAIK catalog (which includes all M > 1.5
events) for the 1987-2003 interval, and it has an acceptable uncertainty:
σ = 0.08 (see legend on the right in Figure 12 A, a).

The maps of isolines in Figure 12 A, a show a minor difference in average 
b values between the first and second time intervals, within one accuracy 
grade according to the color scale (0.90-1.05 and 1.05-1.20). The cartograms 
show the boundary of a considerably lower data density passing along 109°E 
between the southwestern and northeastern BRZ. 

Reliable definition of each value requires about 100 or more 
representative earthquakes of different energies [39]. We apply the concept of 
a confidence interval to obtain a statistically exact definition of a required data 
sample size and to allow for its influence on the accuracy of the estimated 
parameter. 

In practical applications, sufficiently exact confidence intervals for a
sample estimate (of b in this case) can be obtained from the de
Moivre-Laplace theorem [40] of probability theory, according to which 3%
accuracy is provided if the data sample size (n) satisfies:

$|u| / (2\sqrt{n}) < 0.03$    (2)


Thus, if the confidence probability is 0.95 (an allowable statistical
reliability of a conclusion), the quantile $|u|_{0.95} = u_{0.975}$ has the
value 1.96. Based on (2), one can estimate the accuracy of the obtained
values of b. For this purpose, one can contour earthquake numbers inside the
cells in EEDB, i.e., draw the sample size isolines (Figure 12 A, c). So, for
the 0.4x0.6° averaging grid the error is 6-18% (32 < n < 256) in the axial
BRZ area, but reaches 35% (n = 8) outside the lake. To improve the accuracy
and keep the same cell resolution, the parameter can be calculated using an
averaging window with double overlap. The error in the estimates thus
obtained reduces to 4-6% (243 < n < 729) over the whole BRZ area (dark gray)
and even to 2% (n = 729) at some sites (black).
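
The error estimate of formula (2) is easy to evaluate directly. The sketch
below assumes the formula reads |u| / (2 sqrt(n)), which reproduces the
6-18% and 35% figures quoted above; the function names are hypothetical and
the quantile is taken from the standard normal distribution.

```python
import math
from statistics import NormalDist


def relative_error(n: int, confidence: float = 0.95) -> float:
    """Relative error |u| / (2 sqrt(n)) for a sample of n events."""
    u = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 0.95
    return u / (2.0 * math.sqrt(n))


def required_sample_size(target_error: float, confidence: float = 0.95) -> int:
    """Smallest n for which the relative error does not exceed target_error."""
    u = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((u / (2.0 * target_error)) ** 2)


if __name__ == "__main__":
    for n in (8, 32, 256):
        print(n, round(100.0 * relative_error(n), 1), "%")
```

For n = 8, 32 and 256 this gives roughly 35%, 17% and 6%, in line with the
values read from the sample size isolines.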

An advantage of interpreting b cartograms lies in their implications for
elastic energy buildup in the crust (stress sources): b = 1.8 (γ = 1) if all
potential elastic energy is released in an earthquake and b = 0.9 (γ = 0.5)
if the release is incomplete. Thus, the b value bears on the relative amount
of released elastic energy [41], and changes of the parameter in time
correspond to changes in lithospheric stress and strain [42]. Proceeding
from these assumptions, one may conclude that an increase in b observed, for
example, in the central BRZ during the 1996-2003 interval records stress
release in the area. The growth of average b values also appears in the time
series (Figure 13).


Studying Spatial Seismicity Patterns with GIS EEDB 


The use of GIS EEDB has revealed important features in the spatial 
patterns of seismicity at the junction of orogens and stable blocks with rigid 
lithospheric elements. In the case of Central Asia, for instance (Figure 14), 
earthquakes tend to concentrate in the East Tien-Shan and along its borders with the adjacent rigid
tectonic units of Tarim and Dzungaria, both with low seismicity. The 
hypocenter pattern in the vertical cross section along the A-B profile likewise 
indicates a high seismic activity of the Tien-Shan against the nearly aseismic 
Tarim and Dzungaria areas. The earthquake origin depths in this part of 
Central Asia are within 30-40 km. 

Figure 12. A: b variations in BRZ for 1.5 < Ms < 4.5 earthquakes from the Baikal
catalog: a) spatial pattern of the b parameter (slope of the magnitude-recurrence
relationship) as a grid of b(s) values, or b cells (on the left) and zonal maps or b zones
(in the center) for the 0.4x0.6° grid with double overlap, and the respective maps of the
number of earthquakes in the selected grid cells with double overlap (on the right). The
color scale shows errors in b value; b) the magnitude-recurrence relationship within the
1987-2003 interval; c) a map of earthquake numbers in cells without double overlap
given for comparison and showing b errors exceeding 6% over the whole BRZ area.

B: b variations for earthquakes from the JMA catalog around Fukushima Prefecture
(3.5<M<7, aftershocks are removed): a) spatial anomalies averaged over 4 years (grid
size 0.4x0.6°, rms error σ < 0.3; the oval marks the source area of the pending
Tohoku earthquake); b) time series, averaged over 5 years.
Vertical black bars are one-sigma error bars (a), the oval marks points of σ < 0.09; gray
boxes correspond to the number of events; black curve is the number ratio of
small-to-large events.

Figure 13. The BRZ area framed in Figure 12 A, a (a) and variations in b parameter at
every 4 years (b, top panel) and every 8 years (b, bottom panel). Errors in b estimates 
are shown by segments and calculated by formula (2). The average b value increases 
from 0.75 to 0.9. 

Figure 14. Seismicity in the East Tien-Shan and the adjacent Tarim and Dzungaria
areas. Circles are M > 3.5 earthquakes between 1980 and 2011. Inset shows 
hypocenters along the A-B cross-section, which obviously fall within the orogen. 

In Part 1 of this chapter, the EEDB system is considered as GIS software
(it was registered in Rospatent as the GIS-EEDB program: No. 2011613755,
13.05.2011), which was developed to meet the demand for a geographical base
suitable for studying spatial patterns of seismicity and geodynamics, as
systems of this kind were absent or inaccessible at the time when the
research began.

The software’s geographical environment consists of several basic 
modules considered above. The program shell is at the same time a control 
system for the seismological database, and the content and format of all 
included earthquake catalogs are presented correspondingly. This part of our 
research was registered in 2009 in the State Register of Databases as an Expert 
Earthquakes Database. 

The geographical shell and the seismological DB are constantly being 
updated as new detailed geographical and earthquake data become available. 

In the course of its development, the GIS EEDB system has changed from an
information system for geographical visualization of the earthquake
database to a high-tech expert system. As a result, the GIS EEDB now
represents a set of research techniques, in addition to the historical
catalogs and the graphic shell of the Earth. The application of the GIS
approach [38] to studies of seismicity and geodynamics is illustrated with
examples of data analysis. The reported techniques of multistage analysis
using the criterion of concentration, earthquake clustering, and the
recurrence curve slope use the most complete, reliable, and comprehensive
information the system can provide on seismicity and its dynamics.
Synthesizing and interpreting this information (making an expert
assessment) requires the experience, intuition, and knowledge of the
researcher. An obvious advantage of the presented system is that it can be
continuously updated as methods and technologies advance.

In conclusion of Part 1, we note that the proposed version of the GIS-EEDB
graphic shell, which develops the EEDB prototype first designed in FORTRAN
and Pascal for the DOS operating system, has been written in Visual C++ and
runs under Windows (NT, XP, Home, 7 and 8). It has a user-friendly
interface, easily manageable even by inexperienced users, allowing them to
review, visualize, and analyze data in a fast and effective way. The system
has been successfully put into operation and is in use for scientific
purposes, being currently installed in a number of institutions, including
IPGG (Novosibirsk) and Geophysical Surveys of the Russian Academy of
Sciences. The software was used to study the seismicity of the Altai region
[43, 44] and elsewhere, specifically, the areas of nucleation of large
earthquakes.

Below we describe a modification of the GIS-EEDB high-tech expert
system called Fukushima-EEDB and illustrate its use with examples for the 
seismicity of the Fukushima Prefecture area in Japan [5, 45]. 


PART 2. MODIFICATIONS AND VERSIONS OF EEDB FOR 
DIFFERENT GEODYNAMIC AREAS AND CASE STUDIES OF 
SEISMICITY ANOMALIES 


A. V. Mikheeva 


The EEDB subsystems are shown in the work flow chart of Figure 15. 

It is appropriate to apply the EEDB capabilities to study seismic activity
in the Fukushima Prefecture, which was affected by the Great East Japan
Earthquake of Mw = 9.0 of 11.03.2011, also called the 2011 off the Pacific
Coast Tohoku-oki Earthquake by JMA, or the Tohoku earthquake for short. We
have limited the study to the area within 36-39°N, 138-146°E.

The adaptation and modification of the GIS-EEDB system to the local
features of the area consist of the following:


1) Creating an optimal database using the regional earthquake data; 
adapting the instrumental programs for processing the regional data 
and converting it to the EEDB seismological subsystem format; 

2) Developing a geographical system of regional EEDB version for 
qualitative visualization and analysis of seismic data from the region. 
Populating the EEDB geographical subsystem with detailed 
geographical data from the prefecture using ASTER GDEM (for the 
shaded relief model realized in EEDB) and the Natural Earth (for 
detailed cultural and physical layers in the vector and point formats); 

3) Adapting the methods and algorithms to statistical processing of 
seismic data in terms of local geodynamics. Applying comprehensive 
geoinformation analysis to learn the common patterns and anomalies 
of seismicity in the region. 


This preparatory work was carried out for the Fukushima Prefecture in the
environment of supported applications that complement the resources of
EEDB, and was followed by a preliminary seismicity analysis in the adapted
Fukushima-EEDB system. At the same time, the development environment of
GIS EEDB was adapted to Windows 8, its supported applications were brought
up to the standards of current Firefox versions, and new versions of the
Global Mapper and Firefox utilities were employed.

For the seismological base, we chose the Japan JMA catalog as the most
complete one to date. When making the choice, we compared several Japanese
catalogs: JUNEC (Japan University Network Earthquake Catalog, 1985/07/01 -
1998/12/31), NIED (National Research Institute for Earth Science and Disaster
Prevention Earthquake Catalog, 1979/07/01 - 2003/06/30) and JMA (Japan
Meteorological Agency Earthquake Catalog, 1926/01/01 - 2013/05/31).

Figure 15. Work flow chart for seismicity research and the framework of the
EEDB system. 


Seismicity Pattern Visualized by Means of Fukushima-EEDB 


Simple visualization on a map (with sorting by magnitude) and selection
(by sifting out small events) reveal some geologically active structures in
the area adjacent to the Fukushima Prefecture. First among them are seismic
lineaments and super-lineaments. The seismicity of the area was assumed to
be a part of the
trans-regional Japan-Sea lineament [46] (black boundaries in Figure 16), 
which extends from the Arctic Ocean to the Philippine Sea, including the 
submerged Lomonosov Ridge, New Siberian Islands, Sakhalin Island,
Hokkaido and the north-eastern part of Honshu Island. The Japan-Sea 
lineament, in turn, is an element of the Antarctic super-lineament being a 
planetary meridian seam [46]. In addition, the Fukushima map images local 
seismic structures within the linear trend of the largest events for the past 
decade (No. 1 in Figure 16); a response to these events 200 km to the south 
(No. 3 in Figure 16); an inner zone of weak volcanic activity in the form of a 
ring within the location of Aizu-Wakamatsu (No. 2 in Figure 16); a deep 
active zone (No. 4 in Figure 16); and a swarm (No. 5 in Figure 16) near the 
Pacific coast on the Ibaraki-Fukushima prefectural border (a sequence of 
shallow normal-slip earthquakes [47]), including the M,=7.0 earthquake of 
April 11, 2011. This is an example of visual analysis of primary
seismicity information for a selected area using the Fukushima-EEDB
procedures (Figure 16).

The revealed linear pattern of great earthquakes (M, >= 7), including the 
M = 9.0 Tohoku event, appears to be the most interesting result (No.1 in 
Figure 16; Figure 17a). In the map of Nakata et al. [48], there are
northeast- and north-striking tectonic elements in the area, which align with
global structures. However, the orientations of some tectonic structures
(stepwise displacement of an eroded anticlinal ridge, as well as a bend in the
orientation of the Japan Trench (Figure 17a) and faults along it) suggest the
existence along 38°N of an ancient fault or another structure, which has shown
up as a lineament with a high seismic potential through the past decade.


Figure 16. M, >= 3.5 seismicity (13,742 events) near the Fukushima coast, since 2003. 

Figure 17. The rigid linear structure (BB): a) tectonic geomorphological map [48]
(thin- and heavy-line circles show events before and after the Tohoku event, 
respectively); b) P-wave tomography image [50]. 


Incorporating this tectonic map as a background map into the GIS EEDB 
system and visualizing great earthquakes (in the mode of drawing focal 
mechanisms) (Figure 17), we can see that the events of this chain are localized 
in a gap between elongate tectonic uplifts and belong to seismotectonic 
segments of different slip geometries (reverse, strike-slip, and normal): 


• the first three earthquakes (left) in the chain, including the shallower
foreshock of the Tohoku event located just to the north (Miyagi-oki
event of 9.03.11) and the remote response in the southern part of the
region, have compressive reverse-slip mechanisms;

• the event in the middle has a strike-slip mechanism;

• the three events on the right have extensional normal-slip mechanisms.


Frequent changes in geodynamic regime are characteristic of the plate
boundary separating the North American, Eurasian, Pacific and Okhotsk plates
[49] and correspond to a change in the directions of major tectonic strain in
this region.

The profile AA (Figure 17a) striking along the main tectonic structures 
shows that the area between the Tohoku event and its foreshock was still 
weakly active after the event, though the center of the earthquake swarm had 
shifted to the right of the main shock. This likewise suggests the presence of a
rigid linear structure directed across the profile. The plate motion along this 
structure could trigger a cascade of destructive events at the plate edges. 

P-wave high-velocity zones in the area provide more evidence for the 
existence of a rigid structure along 38° N [50, 51] (Figure 17b): “The high-V 
patches in the megathrust zone may result from subducted oceanic ridges, 
seamounts and other topographic highs on the seafloor of the Pacific plate that 
become asperities, where the subducting Pacific plate and the overriding 
continental plate are strongly coupled” [52]. Furthermore, “a landward 
extending oceanic fracture zone controlling the slab morphology change 
around 38°N” was assumed there proceeding from the lateral slip distribution 
of the Tohoku event [53] and other evidence [53, 54]. 

As shown by the profile BB (Figure 17a) running along 38°N and across 
the regional linear structure (Figure 8a), the great earthquake chain is 
conformal to the junction between the Pacific and continental plates. The 
compressive earthquake mechanisms correspond to the Pacific plate and its 
convergence with the continent, while the normal-slip mechanisms represent 
subduction-related extension. The largest event is located in the place where 
the oceanic slab bends down and the plate contact is at the shallowest depth. 



Figure 18. The aftershock process of the Tohoku event in space (a) and time (b),
obtained using the elliptic algorithm based on Prozorov’s method [29] (a). The box 
frames the study area. The curve (b) shows that the aftershock process is not over yet 
(blue and red bars correspond to the number of all and M, >= 2 events, respectively). 


Thus, we have considered an example of primary information analysis of 
seismicity by means of the GIS EEDB system using selection, sorting, and 
visualization. The visualization procedures include drawing focal mechanism 
solutions (beach-ball plots) in a map or in a cross section (with the vertical 
projection of stereoplots in the latter case), using multi-layer coloring (in XOR
mode) of asymmetrical quadrants calculated from parameters in catalogs of 
focal mechanisms; plotting focal mechanisms in the vector mode; selecting 
profiles and creating a relief profile and a cross section of seismicity in a 
selected visualization mode; loading any external raster map, as a
background, into the EEDB environment; recognizing linear structures by a set 
of points distributed in space; etc. All the examples were shown using the 
functions from the first column of the flow chart (Figure 15), including the 
detection of aftershocks. 

The aftershock detection functions implemented in EEDB comprise a set of
different methods (empirical, elliptic, and interactive) and various
modifications of the elliptic method [3, 24]. The efficiency of the different
algorithms was evaluated by estimating the statistics of the events
remaining after filtering and comparing them with the exponential
distribution expected for a random (Poisson) process. After removal of
aftershocks by the elliptic method, only 14% of earthquakes remained in the
study area. We have estimated the number of aftershocks (using also the plot
of Figure 18b) to be a few hundred thousand (303,640 shocks), and the
aftershock process to be continuing, as the seismic background has not
returned to the average level observed before the event. Thus, we can draw
the important conclusion that the greater part of earthquakes in this area
belongs to triggered seismicity rather than being independent; specifically,
they are sequences of events induced by the Tohoku earthquake. Moreover, the
spatial distribution of the Tohoku aftershocks (Figure 18a) has implications
for the size of the active zone. Therefore, the territory adjacent to the
Fukushima Prefecture can be considered a part of the nucleation area of the
Tohoku event, which is suitable for a retrospective search for indicators
for earthquake prediction. Applying the procedure of removing earthquake
swarms and aftershocks brings the seismicity parameters to the stationary
baseline and thus significantly improves the quality of processing.
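
One way to compare a filtered catalog with a Poisson process is to look at
the times between successive events, which are exponentially distributed
for a stationary Poisson process. The sketch below is an illustration under
that assumption, not the test used in EEDB: it returns the coefficient of
variation of the inter-event times (close to 1 for an exponential
distribution, well above 1 for clustered events) and the
Kolmogorov-Smirnov distance to an exponential distribution with the same
mean. The function names are hypothetical.

```python
import math
from typing import Iterable, List, Tuple


def interevent_times(times: Iterable[float]) -> List[float]:
    """Differences between successive origin times."""
    t = sorted(times)
    return [b - a for a, b in zip(t, t[1:])]


def poisson_check(times: Iterable[float]) -> Tuple[float, float]:
    """Return (coefficient of variation, KS distance) of the inter-event
    times relative to an exponential distribution with the same mean."""
    dt = interevent_times(times)
    n = len(dt)
    if n < 2:
        raise ValueError("at least three events are required")
    mean = sum(dt) / n
    if mean <= 0.0:
        raise ValueError("origin times must not all coincide")
    variance = sum((x - mean) ** 2 for x in dt) / (n - 1)
    cv = math.sqrt(variance) / mean
    ks = 0.0
    for i, x in enumerate(sorted(dt)):
        cdf = 1.0 - math.exp(-x / mean)          # exponential model CDF
        ks = max(ks, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return cv, ks
```

A smaller KS distance after filtering than before it would indicate that
the declustering has moved the catalog closer to a random background.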


Studying Seismicity Parameters Using the Analysis Subsystem of 
Fukushima-EEDB 


Investigation into seismicity anomalies can have different applications in 
seismic and geodynamic research. In this Part, we show how the analysis 
subsystem of Fukushima-EEDB is used for retrospective study of anomalies 
preceding a great earthquake, e.g., the Tohoku event. In addition to outlining 
the nucleation area of the selected earthquake, it is important to estimate the
quality of the selected catalog (Figure 15). 

First, one can find out how the earthquake parameters recorded by the 
seismological network change in time by plotting data from the selected JMA 
catalog in EEDB: 


1) The magnitude time series M,(t) shows (i) quality improvement of 
seismological networks in 1977 (for M, < 3) and 1987 (for M, < 1.5) 
evident in increased number of recorded earthquakes, 3 times on 
average in each year; and (ii) stabilization of the recording quality in 
the end of 1997 (for all magnitudes). 

2) The M(t) plot shows another step of improved recording (in 2002- 
2003), and an increased number of recorded events in the curve of a 
higher time resolution since the mid-1990s, followed by stabilization in
recording quality. 

3) The Gutenberg-Richter N(M;) curve (least squares method) shows a 
linear segment in the magnitude range M = 3.5-6.5 corresponding to 
the K = 10-16 interval of the energy scale (Utsu’s [34] maximum 
likelihood thresholding methods). 


Thus, we conclude that the catalog completeness is best within the
interval of 2002-2013, while magnitudes from 3.5 to 6.5 are the most
representative.
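
The text above mentions a maximum likelihood approach alongside the least
squares fit. As an illustration, the sketch below uses the widely cited
Aki-Utsu maximum likelihood estimator of b for events at or above a
completeness magnitude; whether EEDB uses exactly this variant is not stated
in the source, and the function name, the default bin width, and the
first-order uncertainty b / sqrt(n) are assumptions of the example.

```python
import math
from typing import Iterable, Tuple


def b_value_mle(magnitudes: Iterable[float], m_c: float,
                bin_width: float = 0.1) -> Tuple[float, float, int]:
    """Aki-Utsu estimate b = lg(e) / (mean(M) - (m_c - bin_width / 2)),
    using only events with M >= m_c.
    Returns (b, approximate standard error, number of events used)."""
    sample = [m for m in magnitudes if m >= m_c]
    n = len(sample)
    if n < 2:
        raise ValueError("too few events above the completeness magnitude")
    mean_m = sum(sample) / n
    denom = mean_m - (m_c - bin_width / 2.0)
    if denom <= 0.0:
        raise ValueError("mean magnitude must exceed the threshold")
    b = math.log10(math.e) / denom
    return b, b / math.sqrt(n), n
```

For the range judged representative here, the call would be restricted to
events from 2002-2013 with m_c = 3.5.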

Seismic activity A [28] is the first characteristic of the seismic process 
explored by the cartographic method, which implies calculation of contour 
lines for the average values of the parameter on a regular spatial grid. The 
resulting map of the parameter A_15 shows the mean long-term trend of seismic
activity normalized to K = 15 (M, ~ 5.5), obtained from statistically uniform 
averaged seismicity data. The map (Figure 19) shows that the peaks of
long-term seismic activity over the past 20 years in the Fukushima area
follow the coastline of the Fukushima Prefecture, with a maximum near
Hitachi City.

To visualize the spatial distribution of the parameter b, another 
cartographic method is used: mapping parameter changes at uniform time 
intervals. In the “fill” mode of contour line visualization, the program 
performs spatial interpolation using the 2D Bessel function (1). The map 
(Figure 12 B, a) shows a poorly pronounced zone of concentrated negative 
anomalies which originates in the epicentral area of the pending Tohoku event 
and strikes along 38°N in 2007 through 2011 before the main shock. 

The average b value has become 0.15 lower over the past decade (from 0.95
to 0.8 according to Figure 12 B, b), with an error of σ = 3.3-2.9% according
to formula (2). The stability of the estimate can be additionally checked by
the standard deviation, which reaches 0.09-0.10 at the most significant
points (Figure 12 B, b). A decrease in b is known [42] to indicate
deformation and self-organization in the crust with stress buildup in rigid
structures, which may occur during nucleation of a large earthquake.

Figure 19. M,=3.5-7 seismicity in 1994-2013; aftershocks are removed. The isolines
(without coloring) show average frequencies of earthquakes of a certain magnitude on a
yearly scale.


Relative total energy released in earthquakes per unit time, normalized to
the background value of mean seismic energy and presented as
log(E_sum/E_norm), is another seismicity parameter we study.

This parameter, suggested by P.G. Dyadkov and Y.M. Kuznetsova [44], is 
advantageous due to its independence of the spatial seismicity pattern (taking 
into account the earthquake density in each cell is not required for the 
interpretation of results). The parameter highlights (Figure 20) the structure 
and dynamics of the quiescence zone formed two years before the Tohoku 
earthquake, since 2008-2009, north of the pending event. About a year before 
the main shock, quiescence in this zone gave way to weak foreshock activity,
while another distinct quiescence zone appeared in 2010 west and south of the
shock.
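
A rough sketch of how such a parameter might be computed per grid cell is
given below. It is not the published procedure of [44]: the magnitude-energy
conversion uses the classical Gutenberg-Richter relation lg E = 1.5 M + 4.8
(E in joules) as a stand-in for the K energy scale, the background is taken
as the long-term mean energy rate in the same cell, and all function and
parameter names are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence, Tuple

Event = Tuple[float, float, float, float]   # time (years), lat, lon, magnitude
Cell = Tuple[int, int]


def energy_joules(magnitude: float) -> float:
    """Assumed conversion: lg E = 1.5 M + 4.8 (Gutenberg-Richter)."""
    return 10.0 ** (1.5 * magnitude + 4.8)


def relative_energy(events: Sequence[Event],
                    window: Tuple[float, float],
                    background: Tuple[float, float],
                    lat0: float, lon0: float,
                    dlat: float, dlon: float) -> Dict[Cell, float]:
    """log10(E_sum / E_norm) per cell, where both terms are energy rates:
    E_sum over the analysis window, E_norm over the background interval.
    Cells with background activity but no events in the window get -inf."""
    def rates(t0: float, t1: float) -> Dict[Cell, float]:
        total: Dict[Cell, float] = defaultdict(float)
        for t, lat, lon, m in events:
            if t0 <= t < t1:
                cell = (int((lat - lat0) // dlat), int((lon - lon0) // dlon))
                total[cell] += energy_joules(m)
        return {c: e / (t1 - t0) for c, e in total.items()}

    win, norm = rates(*window), rates(*background)
    result: Dict[Cell, float] = {}
    for cell, e_norm in norm.items():
        e_sum = win.get(cell, 0.0)
        result[cell] = math.log10(e_sum / e_norm) if e_sum > 0.0 else float("-inf")
    return result
```

Negative values then mark relative quiescence and positive values relative
activation with respect to the long-term level of the same cell.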

Figure 20. Relative seismic activity (white) and quiescence (gray) at one-year intervals
before the Tohoku event (grid size 0.2x0.3°, 3.5<M<6.5). The circle marks the pending
Tohoku earthquake.


This indicates that the mechanism of a “seismic gap” has come into action.
This effect has been described in the literature, including Japanese
publications [55], as a basic model predicting many large earthquakes.
Moreover, the epicenter falls on the border of the quiescence zone, where the
gradient between high and low relative total energy is the largest
(rightmost map in Figure 20). This confirms our earlier idea [56], inferred
from data on the Baikal rift zone, that the largest earthquakes occur on the
edge of negative seismicity anomalies at the points of maximum gradient of
relative total energy. The gradient map clearly shows distinct positive
anomalies between the pending Tohoku event and its foreshock, as well as in
the area where the earthquake of April 11, 2011 was to occur on the
Ibaraki-Fukushima coast (Figure 21b). The gradient calculation procedure we
use searches for the maximum relative energy difference between the current
cell and its neighbors.
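
The neighbor search described above can be sketched as follows. The source
does not say whether four or eight neighbors are examined or whether the
difference is signed; this example assumes the eight surrounding cells and
the absolute difference, and the function name is hypothetical.

```python
from typing import List, Optional

Grid = List[List[Optional[float]]]


def max_neighbor_gradient(grid: Grid) -> Grid:
    """For every cell holding a value, return the largest absolute
    difference between that value and the values of its (up to eight)
    neighbors. Cells without data, or without any neighbor holding data,
    are returned as None."""
    rows, cols = len(grid), len(grid[0])
    out: Grid = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            value = grid[i][j]
            if value is None:
                continue
            diffs = [
                abs(value - grid[a][b])
                for a in range(max(i - 1, 0), min(i + 2, rows))
                for b in range(max(j - 1, 0), min(j + 2, cols))
                if (a, b) != (i, j) and grid[a][b] is not None
            ]
            out[i][j] = max(diffs) if diffs else None
    return out
```

Applied to a grid of relative energy values, the largest outputs trace the
borders between quiescent and active cells.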


Figure 21. The map of relative energy gradients: a) for background seismicity
(8.3.1980 - 8.3.2011); b) for seismic activity over a year prior to the Tohoku
event (9.3.2010 - 9.3.2011) (grid size 0.1x0.2°, 3.5<M<6.5; aftershocks are
removed). Circles and white arrows mark the location of the pending
Tohoku earthquake (1), its Miyagi-oki foreshock (2), and an earthquake of 11.04.11
that occurred a month after Tohoku (3).




Density of seismogenic fractures or the concentration criterion Kar is a 
purely physical parameter based on rock mechanics, which describes the
seismic process in terms of fracture of solids. The respective theory was 
developed by S. Zhurkov [57], and then A. Zavjalov proposed to use it for 
earthquake prediction and estimated the critical value of the fracture density 
parameter for the Kamchatka region [35]. P.G. Dyadkov proposed to use this 
parameter for characterizing seismicity in a different way [3], namely, to 
estimate the duration of seismic stability preceding a large event from 
flattening of the cumulative K,,, plot relative to the ideal curve of uniform 
fracture growth, as was done for the Altai (Chuya) earthquake of 2003
(Figure 7). 

In the case of the Tohoku earthquake, the K,,, curve also became flatter 6 
years before the main shock, all over the nucleation area (Figure 22a). The 
mapped fracture-density patterns (Figure 22b) reveal the BB structure
(Figures 16 and 17), as well as expansion of fractures toward the area of the
Tohoku event and its foreshock.

Figure 22. Variations of fracture density (K,,, parameter): in time (a) and in space (b).
Black and gray colors in (a) show, respectively, real data and data corresponding to 
uniform fracture growth; in (b) the grid size is 0.3 x 0.5°, 3.5 < M, < 9, Hmax= 50 km. 


Thus, the application of the EEDB tools to seismicity studies in the area 
around the Fukushima Prefecture allows the following inferences: 


1) There exists a rigid linear structure striking along 38°N orthogonal to 
the Japan Trench and other regional tectonic lineaments; 

2) The hypothesis that the inward motion of this structure on 09.03.11 as 
a result of plate boundary slip could trigger a cascade of destructive 
events of 11.03.11 along its edge has been proven valid; 

3) Inspection of the JMA catalog has shown that its completeness is
best within the interval of 2002-2013 and that earthquake magnitudes
from 3.5 to 6.5 are the most representative;

4) The map of long-term seismic activity normalized to K = 15 (energy
scale) shows the greatest activity in the Japan offshore area for the
past 20 years. It is next to the Fukushima Prefecture coast, with the
maximum near Hitachi City;

5) The medium-term and long-term anomalies appeared before the 
Tohoku event in several seismicity parameters, including 1) weakly 
pronounced decrease in b for the past decade prior to the event; 2) 
flattening of the K,,, (fracture density) curve for the past 6 years; 3) 
formation of quiescence zones for the past 2 years; 4) prominent 
positive anomalies of relative energy gradient formed a year before 
the shock. 


All these features may indicate seismic stabilization in the nucleation area 
preceding the Tohoku great earthquake. 


PART 3. VISUALIZATION AND ANALYSIS OF EISC-DATA 


A. V. Mikheeva 


The catalog of the Earth’s impact structures, available at the website of
ICM&MG [1, 6, 58], was created by Anna Mikheeva in 2005 based on
different reference data: published evidence (papers, books), Abstract
Journals, data deposited in VINITI (section “Geology and Geophysics”), as
well as personal communications. As a result, the first Russian Internet catalog
and a Complete Database of all observed, probable, potential, and even
erroneously inferred structures of extraterrestrial origin have been compiled.
There are fields with key parameters indicating the location of footprints left 
by an impact or an explosion of a cosmic body (CB) on the Earth’s surface: 
event name (and/or geographical location, territory), continent or ocean code, 
underwater (*) and comet (‘) marks, the validity index according to 5-grade 
probability scale (from 0 to 4, Figure 23a), coordinates and geometrical size of 
the structures (latitude, longitude, diameter), age (or date) (Figure 23a), the 
relevant text and graphical information (as hyperlinks), and many other fields 
in the extended version (depth, number of objects, erosion rate, gravity and 
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magnetic anomalies, etc.). To date, the Earth’s Impact Structure Catalog 
(EISC) has been one of the most complete published catalogs of this kind, with 
2141 records. It is currently being used by many researchers and is open for 
updating. It is convenient to study the space patterns of impact structures and 
to analyze their parameters with an independent version of the control and 
visualization system (the EISC system). By applying the EEDB software to the 
EISC catalog, one can select a sample of impact structures from the original 
Catalog according to different parameters (diameter, validity, etc.) and 
working areas of different scales from global to local maps, and then obtain 
the related cartographic information, including the locations of impact craters 
(Figure 23b), geophysical and geological layers, etc. In addition to the map 
visualization, the system can list the catalog in the text format, plot different 
parameters, and show results of statistical data processing. The mathematical 
support and the software of the EISC system allow plotting frequency 
distributions of crater diameters (logarithmically proportional to impact 
energy) for events from various samples, as well as different distributions of 
integrated parameters with time, space, and with respect to one another. The 
curves shown in Figure 24 image log-log frequency distributions of craters of 
different diameters and rms deviation of the random distributions from the 
regression line (variance S). The curve in Figure 24a shows abrupt changes at 
lgD = 0.7 (D~5 km). When the curves for D > 5 km, 1 km < D <5 km and D < 
1 km are plotted separately, irregular distributions appear in two latter plots. It 
means that the data available for D < 5 km craters is incomplete, because the 
ancient surface structures may have been modified by erosion or sedimentation 
or, they may be poorly studied, as high-resolution surveys cover only a lesser 
part of the globe. The crater diameters are unevenly distributed in time (Figure 
25), with age constraints missing for a half of events (50% of craters), which 
all are tentatively placed in the end of the time scale. The curves of Figure 26 
show progressive growth in the number of discovered craters, the number of 
really discovered structures departing markedly from the exponential 
distribution N~e7***0'?" predicted in the 1970s [60]. The updated time 
dependence of the number of real discoveries, obtained from analysis of the 
EISC catalogs, is quadratic rather than exponential: 


N~10-17 +13--11, 
The published catalog has also been used to plot the size of craters against 
their age, which has implications for the relaxation time of impact structures. 
This plot is similar to those published in [60] and [61] and is not shown here. 
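
The completeness check described above (a log-log recurrence plot with a regression line and the rms deviation S) can be sketched as follows. This is an illustrative reconstruction, not the EISC code: the use of cumulative counts, the function names, and the synthetic diameters are all assumptions made for the example.

```python
import math


def cumulative_counts(diameters_km, thresholds_km):
    """Number of craters with D >= each threshold."""
    return [sum(1 for d in diameters_km if d >= t) for t in thresholds_km]


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept and the rms residual S."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rms = math.sqrt(
        sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / n
    )
    return slope, intercept, rms


def recurrence_fit(diameters_km, thresholds_km):
    """Fit lg N(>=D) against lg D for the given diameter thresholds."""
    counts = cumulative_counts(diameters_km, thresholds_km)
    pts = [(math.log10(t), math.log10(c))
           for t, c in zip(thresholds_km, counts) if c > 0]
    xs, ys = zip(*pts)
    return fit_line(xs, ys)


# Hypothetical sample: 200 large craters obeying N(>=D) = 1000 / D exactly,
# plus only 30 craters with 1 km <= D < 5 km (far fewer than that law predicts).
large = [1000.0 / k for k in range(1, 201)]
small = [1.0 + 4.0 * i / 30.0 for i in range(30)]

slope_l, _, rms_l = recurrence_fit(large, [5, 10, 20, 50, 100])
slope_s, _, rms_s = recurrence_fit(large + small, [1, 2, 3, 4])
print(f"D >= 5 km: slope {slope_l:.2f}, S {rms_l:.3f}")
print(f"1-5 km:    slope {slope_s:.2f}, S {rms_s:.3f}")
```

In this synthetic case the flattening of the slope below 5 km plays the role of the break at lgD = 0.7: the small-diameter branch no longer follows the regression line of the large craters, which is the signature of an incomplete sample.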




Another relationship, Depth (Dep) vs. Diameter (D), which describes the crater 
geometry, appears to be more interesting (Figure 27). The line averaging all 
points, regardless of target rock, is described by the formula: 


Dep = 66.25 D?” 


Figure 23. Website views of the ICM&MG site “The Complete Catalog of the 
Earth’s Impact Structures” in table and map forms: a) catalog format of 
impact events; b) an EISC GIS-system interactive map of catalogued events 
(http:/abmpg.sscc.ru/impact/karta1.html) according to their sizes (crater 
diameter D) and validity (Val). The bottom panel is a map fragment with giant 
impact craters (D>1000 km) from [59]. 

Figure 24. Recurrence of historic impact events that left craters of 
different diameters: D > 1 km (a); D > 5 km (b); 1 km < D < 5 km (c); and 
D < 1 km (d). Only the 873 events with D > 5 km (b) show regular patterns. 

Figure 25. Time series of crater diameters (age in Ma). Color shows the 
reliability (validity) of structures. 

Figure 26. Progressive growth of the number of discovered craters: “ideal” is 
the expected growth, “all” is the total of events in the EISC catalog, “real” 
refers to proven and probable craters (grades 0 and 1). 

Figure 27. Geometry of impact craters. Lines 1-3 correspond to known depth-diameter 
relationships of craters for different target rocks [59]: 1) Dep~159-D°*” for D<3.8 km 
in crystalline rocks and D<1.2 km in sediments; 2) Dep~52-D°'® for D>4 km in 
crystalline rocks; 3) Dep~204-D°”” for D>2.5 km in sediments. Dark color shows 
proven structures. 


Information Technologies and Methods Used to Study the Shape 
of Craters 


We study the geometry of craters using the shaded relief model and digital 
mapping. Typical morphological elements of impact structures have been 
systematized and can be used as diagnostic features [7, 62]. 

The basic hypothesis of impact-explosive tectonics [63] is that meteorite 
craters on the Earth should be as frequent as on the Moon or Mars. To see how 
many large (D>>100 km) ring structures (RS) there are on the Earth, one can 
examine the respective geological maps based on satellite imagery [64, 65] 
(Figure 28a). Processing algorithms for digital elevation modeling [66] may 
be effective for finding RS on the Earth’s surface at the beginning of a 
search for new craters, which are often difficult to detect in relief models. 

However, these methods are insufficient to identify the origin of the 
detected RS. Discriminating the craters of potentially impact origin among 
many RS requires new diagnostic criteria associated with typical 
morphological elements revealed with advanced image processing 
technologies. 


Figure 28. A fragment of a satellite map of the Ust’-Kamenogorsk area: a 
cosmogeological map [64] (a) and an EISC GIS map [6] (b). Symbols stand for: 
faults (1), arched structures (2), RS of uncertain or complex origin (3). 


Cosmogenic ring structures (CRSs) can be successfully identified in 
tectonically stable areas (on Precambrian cratons and shields) [61] where they 
are well preserved due to the absence of magmatism and thick sediments. 
However, judging by geological evidence, almost all proven astroblemes in 
Russia described by V. Masaitis et al. [67] are buried under sediments or 
submerged, except for several craters (Logancha, Beyenchime-Salaatin) that 
partially remain on the surface and are detectable by morphological analysis of 
aerial photographs. Nevertheless, as real data have shown [1], the original 
cosmogenic terrain has been perfectly preserved in many areas of the Earth, 
for example, in Lake Balkhash surroundings, in Rudny Altai, Kola Peninsula, 
Mexico, Madagascar, and South Africa. Note that over the past century many 
impact craters have been identified on the Moon and other planets based 
exclusively on morphological criteria [68]. Thus, under certain conditions, 
the morphological elements of impact structures may be the basic diagnostic 
criteria of impact origin, superior to petrographic and mineralogical proofs. 

In particular, our diagnostic methods based on the geometry of impact 
craters have allowed us to discover and add to the Catalog more than 40 
potential astroblemes in Rudny Altai [1], as well as to better constrain the 
genesis of ten craters in Madagascar, Northern Italy, and Siberia. The impact 
craters identified only according to morphology are assigned validity grade 2 
in the Catalog [1], which means “potential craters” in the five-grade (0 to 4) 
scale of I. Zotkin and V. Tsvetkov [69]. 

We study the morphology of impact structures by shaded relief modeling, 
using NASA data arrays of SRTM (Shuttle Radar Topography Mission) and 
ASTER GDEM (Global Digital Elevation Model) (Figure 29), with the digital 
mapping technology as described above. 


Figure 29. Probable astrobleme Volchikhinskaya: a) GIS-EISC shaded relief 
model (ASTER GDEM data), b) lower-resolution shaded relief model (SRTM data). 


The new method of detecting impact craters consists in selecting optimum 
foreshortening of an image, or illumination parameters and shadow depth, to 
fill the gaps caused by modification of impact craters affected by erosion, 
deposition, tectonism or volcanism. 
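
The role of the illumination parameters and shadow depth can be illustrated with a generic Lambertian hillshade. This is a minimal sketch, not the GIS-EISC renderer: the grid layout, the `z_factor` vertical exaggeration standing in for "shadow depth", and the synthetic ring-shaped crater are assumptions made for the example.

```python
import math


def hillshade(z, cell, azimuth_deg=315.0, altitude_deg=45.0, z_factor=1.0):
    """Lambertian shading of a regular elevation grid.

    z[i][j] is elevation; j grows eastward and i grows northward (assumed
    layout). Returns brightness in [0, 1] for interior cells only.
    """
    az = math.radians(azimuth_deg)
    alt = math.radians(altitude_deg)
    # Unit vector pointing from the surface toward the light source.
    lx = math.sin(az) * math.cos(alt)
    ly = math.cos(az) * math.cos(alt)
    lz = math.sin(alt)
    rows, cols = len(z), len(z[0])
    shade = [[0.0] * (cols - 2) for _ in range(rows - 2)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            dzdx = z_factor * (z[i][j + 1] - z[i][j - 1]) / (2.0 * cell)
            dzdy = z_factor * (z[i + 1][j] - z[i - 1][j]) / (2.0 * cell)
            norm = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            # The surface normal is (-dz/dx, -dz/dy, 1) / norm.
            dot = (-dzdx * lx - dzdy * ly + lz) / norm
            shade[i - 1][j - 1] = max(0.0, dot)
    return shade


def ring_crater(n=41, cell=30.0, radius=300.0, width=60.0, height=5.0):
    """Hypothetical subdued crater: a low Gaussian rim on a flat plain."""
    c = (n - 1) / 2.0
    grid = []
    for i in range(n):
        row = []
        for j in range(n):
            r = math.hypot((j - c) * cell, (i - c) * cell)
            row.append(height * math.exp(-(((r - radius) / width) ** 2)))
        grid.append(row)
    return grid


def contrast(shade):
    """Brightness range of a shaded image."""
    values = [v for row in shade for v in row]
    return max(values) - min(values)


dem = ring_crater()
default = hillshade(dem, 30.0)
low_sun = hillshade(dem, 30.0, altitude_deg=20.0, z_factor=3.0)
print(round(contrast(default), 3), round(contrast(low_sun), 3))
```

Lowering the light source and exaggerating the relief increases the brightness contrast across the subdued rim, which is why tuning these parameters can make an eroded ring visible.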

This image processing procedure allows detecting RS in a series of 
elevation models, as well as collecting evidence for standard elements 
diagnostic of CRSs. The CRS identification procedure includes several 
steps [61]: 

1) selecting typical morphological models for possible CRSs of the 
region; 

2) selecting diagnostic criteria to prove the extraterrestrial genesis of RS; 

3) comparing the RS in question with proven cosmogenic structures; 

4) revealing contrasts of the discovered elements with the surrounding 
terrain. 


To investigate the crater-related landforms, we use Google Earth satellite 
images in addition to the shaded relief model. In this study we refer only to 
the impact structures from EISC [1] that are in good condition, with 
undisturbed craters located in tectonically stable areas (cratons or shields) 
with minimal magmatism and sedimentary cover. 

The approach was applied earlier to Rudny Altai, where the primary 
cosmogenic terrain has been perfectly preserved, and led to the discovery of 
morphological elements of potential astroblemes (later confirmed in other 
regions as well), for example [62]: raised rims, shadow of central cavity, 
“braces” (called “bank ridge”, “central impact cone”, and “stiffening ribs”, 
respectively, in [62]), and mini-craters. 


Post-Impact Environmental Effects on the Crater Shape 


Identification of CRSs may be problematic because post-impact 
environmental effects (e.g., erosion) can distort the proportions of the crater 
elements. However, according to evidence from many regions with preserved 
original cosmogenic terrain [1], erosion is less destructive for large craters (D 
> 1 km) than subsequent impact events or tectonic activity. For instance, the 
India-Eurasia collision, which has produced the great Himalayas–Tien-Shan– 
Altai-Sayan collisional system, may have “milled” many older astroblemes. 
Therefore, in identifying an impact crater one has to be aware of its possible 
modification by later tectonic movements (e.g., the case of Sudbury [1, 61]). 

We [62] revealed such tectonic effects on the morphology of the potential 
Madagascar 1-5 impact structures discovered in 2006 by Matteo Chinellato 
(Tessera, Venetia, Italy). With the above method, we have identified both the 
main morphological elements and the post-impact tectonic evolution of 
Madagascar 1 [1] (D = 290 km) (Figures 30 a-c). Namely, we recognized the 
other half of the Madagascar-1 giant crater on the African plate [62] according 
to its typical morphological elements of a depression, raised rims (Figure 30a) 
and a central peak (Figure 30c), besides the elements of RS deciphering [70]. 


Figure 30. Contours of the Madagascar-1 crater in a Google Earth map [62] (a), 
a GIS-EISC digital elevation model (b), and in cross sections along profiles 
A and B (b) showing the entire crater ring (in GIS-EISC) (c). 


Following the coastline contours (Figure 30a-b), one can trace a 
probable location of the Africa-Madagascar junction (separated no earlier than 
150 Ma). The fault line (underwater line in Figure 30b) breaking the RS in half 
is parallel to the major axis of the structure and, possibly, to the ballistic 
trajectory of a falling cosmic body (the direction of impact). The Madagascar 
half of the crater is shifted, likely due to velocity difference of the Madagascar 
terrane relative to the African plate. The seafloor motion direction is 
confirmed by the map of linear magnetic anomalies for the adjacent spreading 
zone [62, 71]. 

Thus, the new approach, with the updated morphological diagnostic 
criteria revealed by GIS technology and remote sensing, allows identifying 
impact craters even if they have been affected by active geological processes. 

The capabilities of the EISC and EEDB GIS systems have been combined into a 
single version, GIS ENDDB, using gravimetric data uploaded into the GIS 
database for both seismological and impact cratering applications. 


PART 4. GRAVITY DATA AND NEW APPLICATIONS OF ENDDB 


A. V. Mikheeva and An. G. Marchuk 


The ENDDB system has acquired new applications as it now processes satellite 
gravity data and high-resolution topographic remote sensing data. This 
allows verifying recently found diagnostic morphological elements of impact 
structures and revealing geomorphic patterns of seismic structures, using the 
respective EISC [1] and earthquake catalogs. With our complete global catalog 
EISC [1], one can identify typical persistent elements of impact structures, 
compare them, and estimate their diagnostic validity. 


ENDDB System: Main Features and Geoinformation 
Technologies 


The subject database of the ENDDB system is a combination of the EISC 
catalog and seismological data of more than 60 earthquake catalogs. 
Mathematical methods of catalog studies from GIS EEDB allow visualizing 
samples of the EISC catalog in a pseudo-3D background map according to the 
legend, or in the map scale. 

ENDDB uses the NASA ASTER GDEM data arrays to obtain a high- 
resolution (1 arc-second) shaded relief model, as well as the digital mapping 
technology, which consists in shading surface points according to their 
brightness controlled by the illumination angle. A special technique has been 
developed to add fragments of ASTER GDEM open-file data into the ENDDB 
environment. This operation is necessary because simply loading a single 
global file with high-resolution topographic data into ENDDB 
would be unfeasible (the system size of such file should be 1.62*10" bytes). 
Incorporating the high-resolution data for an area of impact or seismic 
structures takes only a few minutes and consists in downloading the selected 
geographic area files from Internet (or Archive), converting their raw formats 
to ASCII by Global Map, with subsequent conversion of the ASCII file to the 
ENDDB format using a specially designed converter and the corresponding 
changes in the text file describing the external arrays. Besides these 
incorporated fragments, ENDDB stores elevation data of the same resolution 
as in EEDB (see above). 
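
A back-of-the-envelope check shows why a single global file is impractical while regional fragments are not. The sketch assumes 16-bit elevation samples and full 360° x 180° coverage; neither assumption is stated in the text, and the function name is hypothetical.

```python
ARCSEC_PER_DEGREE = 3600


def grid_bytes(step_arcsec, lat_span_deg=180, lon_span_deg=360,
               bytes_per_sample=2):
    """Raw size of a regular latitude-longitude grid (assumed 16-bit samples)."""
    rows = lat_span_deg * ARCSEC_PER_DEGREE // step_arcsec
    cols = lon_span_deg * ARCSEC_PER_DEGREE // step_arcsec
    return rows * cols * bytes_per_sample


full = grid_bytes(1)                                  # whole globe, 1 arc-second
tile = grid_bytes(1, lat_span_deg=1, lon_span_deg=1)  # one 1 x 1 degree fragment
print(f"global: {full:.3e} bytes, fragment: {tile:.3e} bytes")
```

Under these assumptions the global array is of the order of 10^12 bytes, whereas a 1° x 1° fragment needs only a few tens of megabytes, which is consistent with loading just the area of interest.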
For modeling shaded gravity anomalies with the ENDDB tools, global marine 
gravity data (models V16.1 and V18.1 [72]) are embedded into the system. 
These models, which are arrays of gravity pixel values, are of the same size 
(resolution), but V16.1 gives the maximum resolution only for the marine 
global map, while V18.1 also includes the land and can update data for 
coastal areas by interpolation. The resulting resolution of the V18.1 model 
becomes uneven only latitudinally because the original (Mercator) projection 
distorts the cells, and we transform it into a rectangular (i.e. transverse 
cylindrical conformal) projection. As a result, the resolution increases from 
the equator to the poles, being 30 arc-seconds per point on average, which is 
the same as in the more recent V21.1 model. 
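
The latitudinal unevenness follows from the Mercator formula y = ln tan(π/4 + φ/2): rows equally spaced in y are spaced by about Δφ = Δy·cos φ in latitude. The sketch below illustrates this; the equatorial step of 60 arc-seconds and the ±80° latitude limit are assumptions chosen for the example, not parameters quoted in the text.

```python
import math


def mercator_row_latitudes(step_arcsec_at_equator, max_lat_deg):
    """Latitudes (degrees) of grid rows equally spaced in Mercator y."""
    dy = math.radians(step_arcsec_at_equator / 3600.0)
    y_max = math.log(math.tan(math.pi / 4 + math.radians(max_lat_deg) / 2))
    n = int(y_max / dy)
    # Inverse Mercator: latitude = 2 * atan(exp(y)) - pi / 2.
    return [math.degrees(2 * math.atan(math.exp(k * dy)) - math.pi / 2)
            for k in range(n + 1)]


def lat_step_arcsec(lats, lat_deg):
    """Spacing (arc-seconds) between the rows nearest to a given latitude."""
    k = min(range(len(lats) - 1), key=lambda i: abs(lats[i] - lat_deg))
    return (lats[k + 1] - lats[k]) * 3600.0


lats = mercator_row_latitudes(60.0, 80.0)
for phi in (0.0, 30.0, 60.0):
    print(phi, round(lat_step_arcsec(lats, phi), 1))
```

With these assumed parameters the row spacing at 60° latitude is half the equatorial one, so the resolution in latitude improves toward the poles while the spacing in longitude stays constant.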

The gravity data sources for all these models are from the ERS-1 and 
Geosat/GM missions, as well as the recently published EGM-2008 global 
gravity model [72]. 

Identifying impact craters by ENDDB begins with selecting the optimum 
base colors of the image, the parameters of illumination and shadow depth 
[62] for shaded modeling on a regular grid. This procedure allows obtaining 
precise 3D images of the terrain and gravity patterns, and, moreover, furnishes 
data for recognizing standard morphological elements diagnostic of impact 
structures. 


Typical Elements of Impact Craters Identified in GIS Digital 
Models 


In addition to the elements reported in [7, 62], the EISC catalog [1] 
includes other new morphological elements typical of impact structures, which 
are expressed in the shaded elevation and gravity models and identified using 
the ENDDB visualization tools: tail-shaped asymmetry, heart-shaped 
geometry of craters, and tail-shaped gravity lows. 

The tail-shaped crater asymmetry is an elongate topographic low 
accompanying the ring depression of the main crater (of a similar or even 
lesser expression) (Figure 31a-b). 

This asymmetry was observed in eight reliably proven and four probable 
impact craters, as well as in seven potential and two questionable structures. 
To assess the diagnostic validity of this element, one has to take into account 
the astrobleme relaxation associated with the sedimentation rate and duration, 
besides the destructive effects of erosion, tectonism, volcanism, and later 
meteoritic activity [62]. For example, for the 455 Ma Lockne crater [1, 75], 
preference is given to geologically expressed tail-shaped anomalies rather than 
to its topography, because the impact-produced topographic lows have been 
filled with sediments, and the "tails" expressed in the modern surface 
topography (a lake contour) strike in a different direction. 


Figure 31. Tail-shaped asymmetry of the Logancha impact crater expressed in Google 
Earth satellite images (a), the ENDDB elevation model (b), and in the map of gravity 
anomalies accommodating the Logancha astrobleme obtained according to [72] (c). 
The scale shows gravity in mGal. 


On the other hand, the probability of tail-shaped anomalies in an impact 
structure may depend on the kinematics of crater formation: the speed of the 
cosmic body (CB) and angle of its entry into the atmosphere. Particularly, for 
the Logancha crater [1] (Figure 31), the CB entry at a relatively low angle 
produced another morphological feature, a sort of braces on the frontal outer 
side, with gaps between them filled with recent sediments. More geomorphic 
evidence for the low-angle CB trajectory may come from its direction to a 
chain of minor craters or from the frontal part of the crater being prominent 
against the surrounding terrain, i.e. a shoe-shaped crater rim (e.g., in the 
Erofeev crater [1], Figure 32). 

Some tails have a bending geometry (e.g., the Karikkoselkä, Möckeln, 
Korpinen, and Lasnamäe craters [1]). If they were produced by an energy 
(gravitational) influence of a cosmic body [73], the latter would appear to have 
"maneuvered" before falling. The same bends were also observed in the 
tail-shaped zones of craters imaged in gravity anomaly maps [73]. They may 
result either from the original gravity (density) heterogeneity of the target 
rocks, or from density decrease by explosion-induced brecciation [74]. The 
latter explanation is especially relevant to tails with prominent concentric 
anomalies inside [73]. The brecciation and fracture of rocks are caused by 
shock waves from air-gas explosions associated with the CB motion through the 
atmosphere to the surface. However, the bending tail-shaped depressions rather 
suggest gradual destruction of the body on its way through the atmosphere, 
which forms a train of debris behind the body (the tail), thus elongating the 
impacted area of land. 


Figure 32. A shoe-shaped crater form and a tail-shaped gravity low of the 
Erofeev structure (D=10 km): a) the astrobleme relief according to ENDDB, 
b) gravity anomalies according to ENDDB. Note the resolution difference 
between the elevation and gravity data. The scale shows gravity in mGal. 


Figure 33. Tail-shaped relief anomalies and heart-like geometry of craters, 
according to ENDDB: a) Volchikhinskaya probable structure, b) 
Chalkar-Yega-Kara, a still questionable structure, c) Heart of Hindustan 
potential structure [1]; d) gravity anomalies of the Heart of Hindustan 
crater, according to [72]. The scale shows gravity in mGal. 


Figure 34. A heart-shaped crater of the Qinghai Lake probable structure 
(D=60 km) [1]: a) relief according to ENDDB, b) tail-shaped gravity anomalies 
according to [72]. Dark color shows gravity lows. 


A heart-shaped geometry is another morphological feature of craters, in 
addition to the tail-shaped asymmetry, likewise identified according to EISC 
[1]. This geometry is quite common to impact structures in the catalog 
(Figures 33, 34); some heart-shaped structures also have larger or smaller 
“tails” (e.g., the Lasnamäe crater [1]). 


Figure 35. Tail-shaped gravity lows according to ENDDB, accompanying potential 
astroblemes: a) Ladoga (D=80 km, 0.0385 Ma) and Onega (D=125 km, 0.0385 Ma), b) 
Kurai Basin (D=21.5 km, 34-200 Ma), c) Jeskazgan (D=100 km). Arrows in (a) show 
the direction of the CB trajectory, determined from tail-shaped forms filled 
with water (black) and from gravity lows [72] (white). The scale shows gravity 
in mGal. 


Earlier [62] we explained the formation of such crater landforms by 
superposition of three impact structures of different diameters with a common 
rim, produced by an originally single falling body that broke down but did 
not disperse into pieces. This asymmetry, even in the absence of a tail, is a 
reliable indicator of the CB trajectory (Figures 33, 34). 

Both the 3D elevation model of ENDDB (Figures 31-34) and Google 
Earth 3D satellite images can highlight the morphological elements of impact 
structures at the optimum image foreshortening, or if the depressions are filled 
with water or thick vegetation (Figure 31a) [1]. 

Another diagnostic feature found in hundreds of craters from the EISC 
catalog using ENDDB is associated with gravity: tail-shaped gravity lows [73] 
that accompany large astroblemes (Figures 31c; 32b; 33d; 34b). Assuming that 
the gravity lows and the tail-shaped asymmetry in craters are of the same 
origin, one may expect these features to appear together. Note, however, that 
such a comparison is possible only for relatively large craters (D >>15 km), 
the resolution of the available gravity data being much inferior to that of 
the elevation models (~30 arc-seconds per point in V21.1 against 1 arc-second 
in ASTER GDEM). At the same time, the estimates of the CB arrival direction 
from different morphological features may be ambiguous for individual 
structures [75]. 

For example, all three indicators we describe give similar azimuth 
estimates of ~300° in the case of the Qinghai structure (Figure 34) but show a 
30-35° difference for the proven Wanapitei and Popigai craters [7, 73, 75]. 
The variations in CB arrival azimuths estimated from tail-shaped asymmetry 
and gravity lows for the potential Ladoga and Onega structures (Figure 35) are 
25° and 5°, respectively (the contours of the lakes are colored white and show 
a tail-shaped asymmetry). 
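
When azimuth estimates from different indicators are compared, the difference has to be taken on the circle, so that, for instance, 350° and 15° differ by 25° rather than 335°. A small helper of the following kind does this; the function names and the sample azimuths are hypothetical and do not reproduce measurements from the catalog.

```python
from itertools import combinations


def azimuth_difference(a_deg, b_deg):
    """Smallest absolute angle between two azimuths, in degrees (0..180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def azimuth_spread(estimates_deg):
    """Largest pairwise difference among several azimuth estimates."""
    return max(azimuth_difference(a, b)
               for a, b in combinations(estimates_deg, 2))


# Hypothetical estimates from three indicators (asymmetry, heart shape, gravity).
consistent = [298.0, 300.0, 303.0]
ambiguous = [350.0, 15.0, 5.0]
print(azimuth_spread(consistent), azimuth_spread(ambiguous))
```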

We have checked the diagnostic value of tail-shaped negative gravity 
anomalies for craters in Russia using the 2010 gravity maps at a scale of 
1:2,500,000 and found this feature in all large craters produced by bodies for 
which we can assume a trajectory at a relatively low angle to the Earth’s 
surface [73]. However, large proven structures (D > 15 km) are quite few in 
Russia (only 9), and it is important to check this pattern on a global scale. 
Indeed, the gravity imprints of CB trajectories show up in the new shaded 
model of “Global marine gravity” for hundreds of astroblemes (the data are 
available at the website [1]). Furthermore, gravity as part of the GIS-ENDDB 
system can be useful to prove the impact origin of many less certain structures 
(Figure 35), such as submerged or small island structures (where a small island 
is a part of a crater). 


Figure 36. Tail-shaped gravity lows according to ENDDB that accompany potential 
underwater astroblemes: a) Krk (D=14 km, 40.4 Ma), b) Tyrrhenian Basin (D=200 km, 
0.01 Ma), c) Bickerton Island (D=30 km), d) Athos (the observation point of 
2003.07.21 is asterisked), e) Galapagos (D=14 km). Images (a) and (c) are obtained 
with V15.1; the other maps of this and previous figures are obtained with V18.1 [72]. 
The scale shows gravity in mGal. 


Visual observation of submerged craters is difficult, and analysis of 
geophysical evidence is in this case simpler than analysis of morphology. 
The surface gravity anomalies mimic the round shapes of well-preserved 
craters, which can be assigned to impact structures in the presence of a tail 
(Figure 36) even if no gravity data are available to reveal rootless 
anomalies. This assignment may be the first step in a comprehensive study of 
submerged impact structures, in which the standard geological and geophysical 
mapping methods are supplemented by such exotic ones as paleogeographical 
reconstructions of tsunami waves locating the impact origin or, for example, 
location of ring clouds and zones of oceanic heat flow anomalies [61]. 


Typical Structural Elements of Seismicity Identified in GIS 
Digital Models 


The ENDDB software (as well as its prototype, GIS EEDB) offers special 
methods for grouping related earthquakes, such as earthquakes generated by 
spatially proximal active faults, e.g., plate boundaries or seismic blocks 
(the groups can be revealed in the same way as in detecting aftershocks, 
swarms, or clusters of earthquakes; spatial pattern recognition; migration of 
the seismicity center, etc. [76]). 


Figure 37. Gravity anomaly according to ENDDB along the boundaries of seismic 
blocks delineated by faults (black lines) and seismic lineaments (white bands 
according to [77]) in the northern Baikal Rift Zone. Dark color shows gravity highs or 
shadow. 


For example, the algorithm for recognition of linear patterns (images) in a 
set of points distributed in space can identify some real straight and curved 
tectonic structures [78]. The algorithm requires setting the maximum step 
size and deflection angle used to find the next point in the chain of events 
on a seismic border. The structures thus identified can be confirmed by 
geophysical data. Furthermore, the ENDDB geological-geophysical database 
contains layers of seismic lineaments, faults, and trenches, which can 
indicate whether a region under study consists of rigid (seismically 
inactive) or soft structures (Figure 37). Such a comparison is necessary both 
to model a detailed crustal structure by detecting seismic block boundaries 
or individual faults, and to study regional seismicity. 
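
The idea of chaining events under a maximum step size and a maximum deflection angle can be sketched with a simple greedy search. This is an illustration only and should not be read as the algorithm of [78]: the nearest-neighbor rule, the planar coordinates in kilometers, the function name, and the sample epicenters are all assumptions.

```python
import math


def trace_lineament(points, start, heading_deg, max_step, max_deflection_deg):
    """Greedy chain of points starting at `start`.

    Each next point is the nearest unused point within max_step whose bearing
    deviates from the current heading by at most max_deflection_deg.
    Coordinates are planar (x east, y north); bearings are clockwise from north.
    """
    chain = [start]
    unused = [p for p in points if p != start]
    current, heading = start, heading_deg
    while True:
        best = None
        for p in unused:
            dx, dy = p[0] - current[0], p[1] - current[1]
            dist = math.hypot(dx, dy)
            if dist == 0.0 or dist > max_step:
                continue
            bearing = math.degrees(math.atan2(dx, dy)) % 360.0
            turn = abs(bearing - heading) % 360.0
            turn = min(turn, 360.0 - turn)
            if turn <= max_deflection_deg and (best is None or dist < best[1]):
                best = (p, dist, bearing)
        if best is None:
            return chain
        chain.append(best[0])
        unused.remove(best[0])
        current, heading = best[0], best[2]


# Hypothetical epicenters (km): a gently curving line plus two off-line events.
arc = [(0.0, 0.0), (10.0, 1.0), (20.0, 3.0), (30.0, 6.0), (40.0, 10.0)]
noise = [(12.0, 15.0), (25.0, -20.0)]
chain = trace_lineament(arc + noise, start=(0.0, 0.0), heading_deg=90.0,
                        max_step=15.0, max_deflection_deg=30.0)
print(chain)
```

A small deflection limit keeps the chain on a smooth curve and rejects nearby events that would require a sharp turn, while the step limit prevents the chain from jumping across gaps in seismicity.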

Another example shows how a gravity map can be used to detect the 
pattern of aseismic or weakly seismic zones (Figure 38a), which has 
implications for the interaction of rigid tectonic structures with orogens. 
This interaction was visualized using a special EEDB method suggested by P. 
Dyadkov for analyzing seismicity anomalies (activity and quiescence) 
proceeding from indicators of rigid zones (Figures 14, 38b-c). 


Figure 38. Spatial distribution of aseismic and weak seismic zones: a) gravity 
anomalies according to [72] and ENDDB tools (dark color shows gravity lows); b) 
distribution of average annual total energy for 1972-1998 taken as the norm; c) maps 
of aseismic areas (black) and zones of seismic quiescence (gray) in Central Asia, at 
every 4 years [24]. The maps were compiled using data of the combined earthquake 
catalog COMPLEX (M > 4, 1972-2000, 24-46° N; 70-102° E). 


In conclusion, we illustrate the application of the method to the rigid 
structure previously identified (Figures 16-18, 22) by the seismicity analysis 
of the Japan area along 38°N. The gravity map and vertical cross section AA 
show a prominent high at the epicenter of the impending Tohoku earthquake 
(Figure 39). 

The examples of Figures 37, 39, as well as other evidence of relationship 
between the gravity and seismicity patterns [79], show that earthquakes are 
generally confined to prominent features of the gravity field. They are, for 
instance, regional and local gravity gradients at boundaries between blocks 
with different physical and structural properties, including active seismic 
boundaries. Note that regional features of the gravity field of a deep (most 
likely mantle) origin correspond to main structural elements of a zone bounded 
by seismic belts [79]. This determines the prospects of our future research. 


Figure 39. a) Gravity anomaly map of the Tohoku earthquake area according to 
[72] and ENDDB tools. b) The vertical section AA of the gravity anomaly. White 
circle (a) and black arrow (b) show the Tohoku epicenter. The scale shows 
gravity in mGal. 


CONCLUSION 


The geoinformation system ENDDB (Earth’s Natural Disasters Database) 
is an important tool for studying natural disasters, such as earthquakes or 
impact events, using records of the respective catalogs. The GIS system has a 
user-friendly, easy-to-run interface, detailed geographical databases for the 
whole Earth and its regions, and ample databases of seismological and impact 
structure parameters. With its mathematical support, ENDDB can plot 
frequency dependences of magnitudes or sizes (crater diameters) of events 
from various samples, as well as other distributions of integrated parameters 
in time and space, or with respect to one another. The need to use geological 
and geophysical parameters (gravity field, faults, etc.) in the analysis of 
data from both EC and EISC (catalogs of earthquakes and of the Earth’s impact 
structures) was a prerequisite for creating the combined ENDDB system from 
its two prototypes, GIS EEDB (Expert Earthquake Database) and GIS EISC. 

Introducing the gravity information into ENDDB greatly expands its 
application; specifically, it has made it possible to detect new 
morphological features characteristic of impact structures. Finding 
additional morphological indicators is especially important because 
absolutely reliable diagnostic features of extraterrestrial origin are 
lacking, even for the structures whose impact nature has been confirmed by 
rich shock-explosive evidence. This lack of certainty has given the opponents 
of the impact genesis of ring structures reason to continue vigorous debates 
in the literature. Note that the asymmetry of different geomorphic 
characteristics of impact craters and, in particular, the tail-like shapes of 
related geological and geophysical fields or morphological anomalies 
(including the present-day topography) allow discriminating impact structures 
from ring structures of magmatic origin with a large degree of confidence. 

The use of gravity information is important for some seismological 
objectives as well, specifically, for identifying seismic blocks, lineaments, and 
other seismic-morphological structures revealed by means of GIS-ENDDB 
visualization and mathematical tools during analysis of the spatial patterns of 
seismicity. 

An obvious advantage of the presented system is that it can be updated
continuously as methods and technologies advance.

The authors are grateful to I.I. Kalinnikov (Institute of Physics of the
Earth RAS, Moscow) and K.K. Khazanovich-Wulff (Planetology Department of
the Russian Geographical Society, St Petersburg) for useful discussions and
ideas.


ABSTRACT


Water quality evaluation is the overall process of evaluating the physical,
chemical and biological nature of water in relation to natural quality,
human effects and intended uses, particularly uses which may affect
human health and the health of the ecosystem itself. Interpreting
large volumes of water quality data in a form convenient for visual
inspection is an important but often underestimated or omitted step in a
water quality evaluation program. Recently, the need for modern approaches
and tools for interpreting water quality has been emphasized for efficient
water quality management. Geographic Information System (GIS), with its
ability to capture, store, analyze, manipulate, retrieve and display
spatial data, has emerged as a powerful tool for decision-making in
several areas, including the environmental field. This chapter aims at
highlighting the role of GIS in synthesising, compiling, presenting and
interpreting chemical data of both surface water and groundwater. First, a
few relevant fundamental terms and the process of water quality evaluation
are defined and/or described. Thereafter, the chapter presents the
theoretical procedure for applying GIS to assess spatial change or
variability in water quality by characterizing the extent and patterns of
contamination. In general, a water quality monitoring network consists of a
group of point locations with known chemical attributes of water. GIS helps
convert the point values into areal information through spatial
interpolation.

Hence, an overview of spatial interpolation techniques is provided, 
together with the methodologies for employing geostatistical modelling 
(kriging) and inverse distance weighting techniques and for computing 
spatial statistics (mean, median, standard deviation and coefficient of 
variation). The major application of GIS in past groundwater studies has 
been for assessing groundwater vulnerability. 

Therefore, the concept of groundwater vulnerability along with its 
historical perspective is described and different GIS-based overlay and 
index methods used for groundwater vulnerability assessment are 
summarized. Methodologies for applying different GIS methods in 
evaluating the groundwater vulnerability are illustrated through 
flowcharts. 

The major tools for describing groundwater vulnerability in GIS 
framework include DRASTIC, modified DRASTIC, DRAMIC, GOD, 
AVI, SINTACS, EPIK, GLA, PI and COP. 

Furthermore, the development of a GIS-based water quality index for
evaluating water quality is discussed. Finally, the combined use of GIS and
multivariate statistical analysis techniques in delineating water quality
zones is discussed. It is concluded that GIS is a promising geospatial tool
which offers an efficient framework for the sustainable management of
freshwater resources.


1. INTRODUCTION 


Water quality is governed by a set of complex factors, and there is a large
choice of variables used to describe water quality status in quantitative
terms. Hence, it is difficult to provide a simple definition of water
quality. The water quality of the aquatic environment is defined by (a) a
set of concentrations, speciation, and physical partitions of inorganic or
organic substances, (b) the composition and state of aquatic biota in the
waterbody, and (c) a description of temporal and spatial variations due to
factors internal and external to the waterbody (Meybeck and Helmer, 1992).

Water quality is also defined as a consequence of natural physical and
chemical state of water (surface or subsurface) as well as alterations caused by 
human activities (Fetter, 1994). The quality of water is a measure of its 
suitability as a water supply source for domestic and agricultural consumption 
as well as for irrigation, industrial and other purposes; the suitability of water 
is decided based on criteria for various uses and water quality standards. The 
definition of water quality is therefore not objective; rather it is socially 
defined depending on the desired use of water. Different water uses require 
different standards of water quality and water quality criteria define desirable 
characteristics and acceptable levels of constituents for water of various 
intended uses (Freeze and Cherry, 1979; Todd, 1980; McCutcheon et al., 
1993; Fetter, 1994). To establish quality criteria, the measures of physical, 
chemical, and biological constituents must be specified, together with standard 
methods for comparing results of water quality analyses (Todd, 1980; 
McCutcheon et al., 1993). The pollution of the aquatic environment can be 
defined as introduction of substances or energy by man, directly or indirectly, 
which result in such deleterious effects as harm to living resources, hazards to 
human health, hindrance to aquatic activities including fishing, impairment of 
water quality with respect to its use in agricultural, industrial and often 
economic activities (Meybeck and Helmer, 1992), and reduction of amenities 
(GESAMP, 1988). The term pollution refers to changes caused by humans and 
their actions that result in water-quality conditions that negatively impact the 
integrity of the water for beneficial purposes, including natural ecosystem 
integrity (Johnson, 2009). Determining the extent of pollution is difficult, 
given the wide range of constituent measures that characterize water quality 
(e.g., dissolved and suspended solids, organics, bacteria, toxics, and metals). 

Evaluating water quality, assessing its spatial and temporal variations and
mapping its vulnerability are among the important tasks in managing the
quality of usable water resources.

There are many tools and techniques for evaluating water quality. The
geographic information system is one of them, and it is gaining wide
popularity because it can store, analyze and display large volumes of
spatial data within a single framework.

Geographic Information System (GIS) has emerged as a powerful tool for 
capturing, storing, analyzing, manipulating, retrieving and displaying spatial 
data and using these data for decision making in several areas including 
engineering and environmental fields (e.g., Stafford, 1991; Goodchild et al., 
1993; Burrough and McDonnell, 1998; Lo and Yeung, 2003). It allows swift
organization, quantification and interpretation of a large volume of
spatial data with computational accuracy and minimal risk of human error.

GIS is an effective tool for analyzing spatial and temporal data of water
quality (Burrough and McDonnell, 1998; Gurnell and Montgomery, 2000; 
Chang, 2002; Chen et al., 2004). Information on spatial and temporal 
variability/trends of water quality is very helpful in the decision-making 
process (Freeze and Cherry, 1979; Todd, 1980; Fetter, 1994). In addition, 
water quality mapping is essential for monitoring, pollution hazard 
assessment, modeling and environmental change detection (Goodchild et al., 
1993; Skidmore et al., 1997; Chen et al., 2004; Jha et al., 2007). In a GIS 
framework, point estimates of water quality parameters can be spatially 
interpolated by spatial interpolation techniques such as kriging, inverse 
distance weighting, etc. to develop parameter concentration maps at different 
time scales or other related maps. GIS presents spatial information in the
form of maps on which different features are located by symbols, and it is
integrated with databases containing multiple attributes of the mapped
features. A map helps to show where and what things are, and how they are
related. The GIS database containing spatial and point attributes can then
be used to generate interactive reports and maps, which in turn can support
decision-making about the best design alternatives and their impacts.
Furthermore, GIS-based maps serve as a powerful communication medium,
presenting information in such a way that the people involved in the
planning and management of water quality can better understand it and
become more involved.

This chapter deals with various methods for water quality evaluation that
are integrated with GIS, e.g., spatial statistics, vulnerability mapping
and water quality indices. For each method, the central role of GIS in
water quality evaluation is highlighted.


2. POINT AND NON-POINT SOURCES OF POLLUTION 


GIS plays a central role in water quality management practice and 
augments efforts to monitor water quality changes in surface waterbodies or 
aquifers, to calculate pollutant concentrations and loads to a surface waterbody 
or groundwater, and model water quality of aquatic systems (Johnson, 2009). 
Water quality protection and management require the waste-assimilative
capacity of receiving waters to be quantified, which is done using the
concept of the 'total maximum daily load' (TMDL). A TMDL is assessed by
taking account of all sources of a pollutant, both point and nonpoint, and
the waste-assimilative capacity of the receiving waterbody (USEPA, 1991).

Water quality evaluations require a broad range of environmental and
administrative data, and one of the major categories of data is pollutant
sources. Pollution may result from point sources or non-point sources
(diffuse sources). Point sources are clearly identified at one or more
locations, such as wastewater flowing in conduits from municipalities and
industries. Non-point sources, in contrast, are diffuse and cannot be tied
to specific point locations; examples are urban runoff and erosion from
agricultural and deforested lands. In other words, non-point sources
include everything that is not a point source. Sometimes it is difficult to
distinguish between point and non-point sources of pollution, because a
diffuse source on a regional or local scale may result from a large number
of individual point sources, such as automobile exhausts.

An important difference between a point and a diffuse source is that a
point source is amenable to control through collection and treatment
processes, while a non-point source is difficult to control with engineered
facilities (e.g., collection and treatment) because of its diffuse
character.

A diffuse pollution source consisting of several point sources may also be
controlled, provided all the point sources can be identified exactly. The
most common point and non-point sources of pollution are listed in Table 1.


3. WATER QUALITY EVALUATION 


Water quality evaluation is the overall process of evaluating the physical,
chemical and biological nature of water in relation to natural quality,
human effects and intended uses, particularly uses which may affect human
health and the health of the aquatic system itself (Bartram and Ballance,
1996). Water
quality evaluation includes the use of monitoring data to define the condition 
of water, to provide a basis for detecting trends and to provide information 
enabling the establishment of cause-effect relationships. Thus, important 
aspects of water quality assessment are: interpretation of water quality data, 
reporting of results, and recommendations for future actions. Three important 
components of water quality evaluation in a logical sequence are monitoring, 
followed by assessment, followed by management (Meybeck et al., 1992). 

The process of water quality evaluation involves many complex operations,
which are linked together to form a chain of about twelve links. Every link
is important, because the failure of any one of them weakens the entire
evaluation.

The elements of water quality evaluation programmes may differ depending
upon the objectives of the programme. However, certain standard elements
are common to almost all types of water quality evaluation programme. A
generalized structure of a water quality evaluation programme consisting of
twelve elements is shown in Figure 1.

Prior to designing a water quality evaluation programme, clear-cut 
objectives should be set on the basis of environmental conditions (pollution 
sources), water uses (present and future), and water legislation. 

Once the programme objectives are set, the monitoring design is determined
from a review of existing water quality data, supported by a preliminary
survey. In the next step, monitoring operations are performed to collect
water samples from selected sites in the field, and the collected samples
are then analysed in the laboratory.


Table 1. Summary of point and non-point sources of pollution

No. | Point Sources | Non-Point Sources
1 | Municipal and industrial wastewater effluents | Return flow from irrigated agriculture and orchards
2 | Runoff and leachate from solid-waste disposal sites | Runoff from crops, pasture, and rangelands
3 | Runoff and drainage from animal feedlots | Runoff from logging operations, including logging roads and all-terrain vehicles
4 | Runoff from industrial sites | Urban runoff from small communities and unsewered settlements
5 | Storm sewer outfalls from urban centers | Drainage from failing septic tank systems
6 | Combined sewer overflows and treatment plant bypasses | Wet and dry atmospheric fall-out or deposition over waterbodies (e.g., acid rain)
7 | Mine drainage and runoff (also oil fields) | Flow from abandoned mines and mining roads
8 | Discharges from storage tanks, chemical waste piles, and ships | Runoff and snowmelt from roads outside urban areas
9 | Runoff from construction sites | Wetland drainage
10 | Airport snowmelt and runoff from deicing operations | Mass outdoor recreation and gatherings
11 | - | Military training, manoeuvres, shooting ranges

After Johnson, 2009.

The last step, which is important but often underestimated or omitted in a
programme, is the synthesis, compilation, presentation and interpretation
of the large volume of chemical data in a form convenient for visual
inspection (Freeze and Cherry, 1979; Sara and Gibbons, 1991). On completion
of the programme, recommendations should be communicated to the relevant
water authorities for water management, water pollution control, and
eventually the adjustment or modification of monitoring activities.


4. TOOLS FOR WATER QUALITY ANALYSIS 


Several conventional tools for the graphical analysis of water quality are 
described in standard textbooks on groundwater hydrology or hydrogeology 
(Freeze and Cherry, 1979; Karanth, 1987; Sara and Gibbons, 1991). Recently,
the need to apply modern approaches and tools, such as multivariate
statistical techniques (e.g., principal component analysis, hierarchical
cluster analysis, discriminant analysis and correspondence analysis) and
remote sensing and GIS techniques, has been emphasized for the efficient
analysis of water quality (e.g., Jha et al., 2007; Steube et al., 2009). A
state-of-the-art review of tools and techniques for the interpretation of
water quality can be found in Machiwal and Jha (2010), wherein the
available tools and techniques (conventional as well as modern) for
analyzing water quality are classified into four major groups:
(i) graphical, (ii) statistical, (iii) remote sensing (RS), geographic
information system (GIS) and geostatistical, and (iv) modelling techniques.


5. GIS-BASED ASSESSMENT OF 
WATER QUALITY VARIABILITY 


An extensive literature search by the authors of this chapter revealed that
most studies dealing with GIS applications for evaluating water quality
focus on subsurface water rather than surface water.

This is most likely because large numbers of point groundwater samples are
relatively easy to obtain through wells (hand pump, open well, tubewell,
etc.), whereas sampling of surface waterbodies requires some means (e.g., a
boat) of reaching different points in the waterbody.

Figure 1. Generalized framework of water quality evaluation program showing
standard aspects.

Modified from Meybeck and Helmer, 1992.

Another possible reason for the widespread application of GIS in
groundwater quality studies is that water quality can vary significantly
over a short distance within an aquifer. This is generally not the case in
a surface waterbody, where water at one point is free to move and mix with
water at other points; this results in relatively low spatial variability,
especially in the stagnant water of small ponds and reservoirs. Also, the
water stored in surface waterbodies is mostly rainwater with low
concentrations of the major ions, because water on the surface has little
opportunity to come into contact with geological terrains containing
soluble minerals and other substances. Groundwater, on the other hand,
moves through subsurface formations and meets different kinds of salts,
minerals, etc., which dissolve readily in the flowing water. Thus, the
concentrations of the major ions and of metals are more likely to be
elevated in groundwater than in surface water.

This may be one of the reasons why groundwater quality studies report such
wide application of GIS techniques.

Spatial and temporal variability of water quality is one of the main
features of the different types of surface and subsurface waterbodies.
Water quality variations over space and time are largely determined by the
hydrodynamic characteristics of the waterbody. The water quality of a
waterbody varies in all three spatial dimensions, and these variations are
further altered by flow direction, discharge and time (Meybeck and Helmer,
1992). Thus, measurements at a single location may not adequately represent
the water quality of the entire waterbody. Instead, a network or grid of
sampling sites is needed to capture the spatial variations of water
quality. Generally, one-dimensional samples are collected along a
longitudinal profile in the case of a river and along a vertical profile in
the case of a pond, reservoir or lake, as illustrated in Figures 2(a,b).
Two-dimensional profile sampling is appropriate for observing plumes of
pollution from a source and is most suitable for the groundwater quality of
aquifers (Figure 2c). The temporal variability of chemical water quality
can be divided into five categories based on time scale, as listed in
Table 2.


5.1. Characterizing Extent and Patterns of Contamination 


In water quality studies, activities begin with field data collection,
where it is common practice to obtain data from multiple locations and
sources. All the collected data need to be collated and converted into the
common format of the water quality database.

GIS provides powerful functions to capture and collate water quality data.
Water quality studies involving repetitive, archival and historic use of
the data also require the data to be stored in a formal database that can
be used for exploratory purposes. A water quality database to be utilized
for GIS applications requires spatial coordinates, i.e., latitude and
longitude (or x- and y-coordinates), to be attached to the data. The water
quality data are further characterized by the depth at which the sample is
taken (vertical z-coordinate).

Monitored data must also be characterized with regard to the time t at
which the sample is taken. Thus, the concentration (c) of any physical,
chemical or biological parameter can be defined by the following function:


c=f(x, y,z,t) (1) 


In surface waterbodies such as rivers where discharge (Q) is a significant 
quantity, the flux determination and data interpretation also require knowledge 
of water discharge, and thus the concentration should also be a function of Q 
as shown below. 


c=f(x,y,z,t,Q) (2) 


First, the sampling locations are placed within the GIS environment on the
basis of their spatial coordinates. Then, the sampling points are linked to
attribute tables in which the attributes of each individual site are
stored, e.g., the concentrations of the major ions (calcium, magnesium,
chloride, carbonate, etc.) at that point.

Finally, the concentrations of a water quality parameter can be displayed
over the entire space through spatial interpolation. Many spatial
interpolation techniques have evolved over time; an overview is provided in
the following section.
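
The record structure implied by Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `WaterSample`, its field names and the helper `points_for_parameter` are hypothetical and do not correspond to any particular GIS package.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class WaterSample:
    """One sampling point: location (x, y, z), time t and measured attributes."""
    x: float                      # easting or longitude
    y: float                      # northing or latitude
    z: float                      # sampling depth
    t: str                        # sampling date/time, kept as text here
    concentrations: Dict[str, float] = field(default_factory=dict)
    discharge: Optional[float] = None   # Q, relevant for rivers (Equation 2)


def points_for_parameter(samples: List[WaterSample],
                         parameter: str) -> List[Tuple[float, float, float]]:
    """Return (x, y, value) triples for one parameter, ready for interpolation."""
    return [(s.x, s.y, s.concentrations[parameter])
            for s in samples if parameter in s.concentrations]


samples = [
    WaterSample(1.0, 2.0, 5.0, "2020-01-15", {"chloride": 35.0, "calcium": 48.0}),
    WaterSample(3.0, 4.0, 8.0, "2020-01-15", {"calcium": 60.0}),
]
print(points_for_parameter(samples, "chloride"))
```

Keeping the attributes in a dictionary mirrors an attribute table in which not every site has a value for every parameter; sites lacking the requested parameter are simply skipped.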


5.2. Overview of Spatial Interpolation Techniques 


In numerical analysis, spatial interpolation or multivariate interpolation
is the interpolation of functions of more than one variable. It consists of
interpolating a multivariable function, known at given points, to yield
values at arbitrary points.


Figure 2. Sampling strategies for exploring spatial variations of water
quality in (a) river, (b) reservoir and (c) groundwater aquifer.

Modified from Meybeck and Helmer, 1992.


Table 2. Scale of temporal water quality variability and causing factors

Scale of Temporal Variability | Causing Factor
Minute-to-minute to day-to-day variability | Water mixing, fluctuations in inputs, etc., mostly linked to meteorological conditions and water body size (e.g. variations during river floods)
Diurnal variability (24-hour variations) | Biological cycles, light/dark cycles, etc. (e.g. O2, nutrients, pH), and cycles in pollution inputs (e.g. domestic wastes)
Days-to-months variability | Climatic factors (river regime, lake overturn, etc.) and pollution sources (e.g. industrial wastewaters, run-off from agricultural land)
Seasonal hydrological and biological cycles | Mostly in connection with climatic factors
Year-to-year trends | Human influences

After Bartram and Ballance, 1996.


Most hydrogeologic applications of spatial interpolation involve quantities
that vary in space, but the methods may also apply to quantities that vary
in time (Kitanidis, 1999).

If function values are known on a non-uniform grid, the available methods
are nearest neighbor interpolation, natural neighbor interpolation, inverse
distance weighting, kriging (one of the geostatistical techniques), and
radial basis functions (e.g., Gotway et al., 1996; Robinson and
Metternicht, 2006; Namgial and Jha, 2009). Past studies dealing with GIS
applications in water quality have mostly used geostatistical modeling and
inverse distance weighting techniques for spatial interpolation. The
literature also shows that geostatistical modeling was originally developed
for subsurface studies and is widely used in hydrogeologic studies.


5.2.1. Overview of Geostatistical Modeling Technique 

Geostatistical modelling is a set of statistical estimation techniques 
involving quantities which vary in space (i.e., spatial variables). Geostatistical 
techniques for describing and interpolating spatially correlated data take 
advantage of the general observation that, on average, values closer together in 
space will be more similar than those farther from each other. The steps in 
applying these techniques include developing ‘theoretical semi-variogram 
models’ that describe the spatial variation between pairs of spatially or 
temporally related samples and then using these models to estimate sample 
parameters and their error variances at unknown locations. Although 
geostatistical modelling techniques were originally used in geological sciences 
(Journel and Huijbregts, 1978), they have also been frequently applied in 
hydrological, agricultural and ecological sciences to evaluate spatial 
dependence of surface/subsurface properties and ecological communities, or to 
interpolate these parameters (e.g., Goovaerts, 1999; Castrignano et al., 2000; 
Mouser et al., 2005; Schaefer and Mayor, 2007). The process of applying GIS 
and geostatistical modelling techniques for developing a spatial distribution 
map of a water quality variable is illustrated in Figure 3. 


(A) Spatial Estimation by Kriging Technique 

In geostatistics, let $Z(x)$ be a random function representing the
concentration of a water quality variable, measured at $n$ locations in
space as $z(x_i)$, $i = 1, 2, \ldots, n$. If the value of $Z$ has to be
estimated at a point $x_0$ which has not been measured, the kriging
estimate is defined as (Journel and Huijbregts, 1978; Kitanidis, 1997):

$$Z^*(x_0) = \sum_{i=1}^{n} \lambda_i \, z(x_i) \qquad (3)$$

where $Z^*(x_0)$ = estimate of the function $Z(x)$ at point $x_0$ and
$\lambda_i$ = weighting factors that minimize the variance of the
estimation error (ordinary kriging weights).

Two conditions are imposed on Equation (3): the unbiasedness condition and
the condition of optimality. The unbiasedness condition means that the
expected value of the estimation error, i.e., the mean difference between
the estimated $Z^*(x_0)$ and the true (unknown) $z(x_0)$ value of the
concentration of the water quality variable, should be zero. The condition
of optimality means that the variance of the estimation error should be a
minimum.

With the spatial structure defined by the theoretical variogram, a kriging
system of linear equations combining the neighbouring information can be
defined as

$$\sum_{j=1}^{n} \lambda_j \, C(x_i, x_j) - \mu = C(x_i, x_0), \qquad i = 1, 2, \ldots, n \qquad (4)$$

subject to the constraint on the weights:

$$\sum_{i=1}^{n} \lambda_i = 1 \qquad (5)$$

where $\mu$ = Lagrangian multiplier and $C(x_i, x_j)$ = value of the
covariance between two points $x_i$ and $x_j$.

In the intrinsic case, i.e., when working with the variogram $\gamma$, the
covariances in Equations (4) and (5) are simply replaced as follows
(Marsily, 1986; Ahmed, 2006):

$$C(x_i, x_j) = C(0) - \gamma(x_i, x_j) \qquad (6)$$

$$C(x_i, x_0) = C(0) - \gamma(x_i, x_0) \qquad (7)$$

Equations (6) and (7) hold only when both the covariance and the
variogram exist, i.e., when the variables are stationary.
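
As a minimal sketch of how Equations (3) to (5) fit together, the following Python code assembles and solves the kriging system for one target point. It is not taken from any of the cited software packages: the exponential covariance function `exp_covariance`, its default parameters and the small Gaussian-elimination helper `solve_linear` are assumptions made only to keep the example self-contained.

```python
import math


def exp_covariance(h, sill=1.0, a=10.0):
    """Assumed covariance model C(h) = sill * exp(-h / a); illustrative only."""
    return sill * math.exp(-h / a)


def _dist(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular kriging system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ordinary_kriging(points, values, x0, cov=exp_covariance):
    """Return (estimate, weights, mu) at x0 following Equations (3)-(5)."""
    n = len(points)
    # Rows 0..n-1: sum_j lambda_j C(x_i, x_j) - mu = C(x_i, x0)   (Equation 4)
    A = [[cov(_dist(points[i], points[j])) for j in range(n)] + [-1.0]
         for i in range(n)]
    # Last row: sum_j lambda_j = 1                                (Equation 5)
    A.append([1.0] * n + [0.0])
    b = [cov(_dist(p, x0)) for p in points] + [1.0]
    solution = solve_linear(A, b)
    weights, mu = solution[:n], solution[n]
    estimate = sum(w * v for w, v in zip(weights, values))        # Equation 3
    return estimate, weights, mu


wells = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
chloride = [1.0, 2.0, 3.0, 4.0]
estimate, weights, mu = ordinary_kriging(wells, chloride, (5.0, 5.0))
print(round(estimate, 6), [round(w, 6) for w in weights])
```

Because the target lies at the centre of a square of four wells, the four weights come out equal and the estimate is the plain average; at a sampled location the weights collapse onto that sample, so the method reproduces the measured value.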


(B) Geostatistical or Variogram Models 
The experimental geostatistical or variogram model is a function of the
separation vector between two points $i$ and $j$.

Figure 3. Flowchart showing step-by-step methodology for applying GIS and
geostatistical techniques for generating maps of water quality data.

The values of the separation vectors, e.g., $h_1$, $h_2$, etc., are decided
first such that

$$h = \lvert x_i - x_j \rvert \qquad (8)$$

Depending upon the value of $h$, the data are grouped into pairs and the
function defined below is averaged to obtain the variogram $\gamma(h)$
(Goovaerts, 1997):

$$\gamma(h) = \frac{1}{2N_h} \sum_{i=1}^{N_h} \left[ z(x_i) - z(x_i + h) \right]^2 \qquad (9)$$

where $N_h$ = number of pairs for a given lag distance $h$.

A theoretical geostatistical or variogram model (Figure 4) can be defined
essentially by its 'sill' and 'range'. The 'sill' is the constant value on
the y-axis around which the variogram stabilizes at large distances, and
the 'range' is the value on the x-axis at which the variogram becomes
constant or nearly constant. The sill value is usually very close to the
variance of the variable (Matheron, 1965; Ahmed, 2006). In addition, the
sudden apparent jump near the origin that occurs in some cases is known as
the 'nugget' effect. The shape of the variogram between the origin and the
point of stabilization differs from variable to variable and depends
entirely on the nature of the variable's variability (Matheron, 1965). In
order to understand the spatial structure, the experimental water quality
data are grouped into lag-distance classes containing approximately the
same number of pairs, and semi-variogram values are calculated for each
class (denoted by the individual points shown in Figure 4) using
geostatistics or GIS software packages such as MathWorks, GSLib, GSTAT,
GeoPack, ILWIS, ArcGIS, IDRISI, etc.
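
For readers who want to see the calculation behind Equations (8) and (9) rather than rely on a software package, a minimal Python sketch follows. The function name `experimental_variogram` and the use of fixed lag-class boundaries (`lag_edges`) are assumptions for illustration; the packages listed above may form lag classes differently, for example with approximately equal numbers of pairs per class.

```python
import math


def experimental_variogram(points, values, lag_edges):
    """Experimental semi-variogram per lag class.

    points    : list of (x, y) sampling locations
    values    : measured concentrations at those locations
    lag_edges : increasing class boundaries, e.g. [0, 1.5, 2.5, 3.5]
    Returns a list of (mean lag, gamma, number of pairs) for non-empty classes.
    """
    # Separation distance h (Equation 8) and squared difference for each pair.
    pairs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            pairs.append((h, (values[i] - values[j]) ** 2))

    result = []
    for lower, upper in zip(lag_edges[:-1], lag_edges[1:]):
        in_class = [(h, sq) for h, sq in pairs if lower < h <= upper]
        n_h = len(in_class)
        if n_h == 0:
            continue  # no pairs in this class
        mean_lag = sum(h for h, _ in in_class) / n_h
        gamma = sum(sq for _, sq in in_class) / (2.0 * n_h)   # Equation 9
        result.append((mean_lag, gamma, n_h))
    return result


# Four samples along a line with a steady trend in concentration.
transect = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
conc = [0.0, 1.0, 2.0, 3.0]
for lag, gamma, n_pairs in experimental_variogram(transect, conc, [0, 1.5, 2.5, 3.5]):
    print(lag, gamma, n_pairs)
```

In this toy transect the semi-variogram keeps rising with lag and never reaches a sill, which is the typical signature of a trend in the data rather than of a stationary variable.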


(C) Fitting of Theoretical and Experimental Variograms 


The experimental variogram calculated from the observed water quality data
using Equation (9) is usually an erratic curve (Kitanidis, 1997, 1999).
Because of its irregular nature, this experimental variogram cannot be used
directly for estimation. Therefore, the curve of the experimental variogram
is approximated by a theoretical curve with a defined mathematical
expression. This smooth curve fitted to the experimental variogram is known
as the 'theoretical variogram', as shown in Figure 4.

Figure 4. Fitting of the theoretical variogram to an experimental variogram.

The fitting or modeling is performed in several ways, mostly visually or by
using some measure of the difference between the two variograms, but
always on a trial-and-error basis. Automatic fitting has sometimes been
proposed but has not proved to be very useful. The commonly used variogram
models are: spherical, circular, Gaussian, and exponential (Isaaks and
Srivastava, 1989; Kitanidis, 1997). The mathematical expressions for these
theoretical variogram models are given below.


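(i) Spherical Model:

$$\gamma(h) = C_0 + C \left[ \frac{3h}{2a} - \frac{1}{2} \left( \frac{h}{a} \right)^3 \right], \quad \text{for } 0 < h \le a \qquad (10)$$

$$\gamma(h) = C_0 + C, \quad \text{for } h > a \qquad (11)$$


(ii) Circular Model:

$$\gamma(h) = C_0 + C \left[ 1 - \frac{2}{\pi} \cos^{-1}\!\left( \frac{h}{a} \right) + \frac{2h}{\pi a} \sqrt{1 - (h/a)^2} \right], \quad \text{for } 0 < h \le a \qquad (12)$$

$$\gamma(h) = C_0 + C, \quad \text{for } h > a \qquad (13)$$


(iii) Gaussian Model:

$$\gamma(h) = C_0 + C \left[ 1 - \exp\!\left( -\frac{h^2}{a^2} \right) \right] \qquad (14)$$


(iv) Exponential Model:

$$\gamma(h) = C_0 + C \left[ 1 - \exp\!\left( -\frac{h}{a} \right) \right] \qquad (15)$$


where $C_0$ is the nugget, $C_0 + C$ is the sill, $a$ is the range, and $h$
is the separation vector or lag distance.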
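
The four model expressions can be written as small Python functions, which makes it easy to overlay a candidate curve on the experimental points during trial-and-error fitting. This is a sketch under the parameterization used in Equations (10) to (15), with nugget `c0`, partial sill `c` and range `a`; the function names are illustrative, and other texts scale the Gaussian and exponential exponents differently.

```python
import math


def spherical(h, c0, c, a):
    """Spherical model, Equations (10) and (11)."""
    if h > a:
        return c0 + c
    return c0 + c * (1.5 * h / a - 0.5 * (h / a) ** 3)


def circular(h, c0, c, a):
    """Circular model, Equations (12) and (13)."""
    if h > a:
        return c0 + c
    r = h / a
    return c0 + c * (1.0 - (2.0 / math.pi) * math.acos(r)
                     + (2.0 * r / math.pi) * math.sqrt(1.0 - r * r))


def gaussian(h, c0, c, a):
    """Gaussian model, Equation (14)."""
    return c0 + c * (1.0 - math.exp(-(h / a) ** 2))


def exponential(h, c0, c, a):
    """Exponential model, Equation (15)."""
    return c0 + c * (1.0 - math.exp(-h / a))


# Compare the models for an assumed nugget of 0.1, partial sill of 0.9
# and range of 100 distance units.
for lag in (25.0, 50.0, 100.0, 200.0):
    print(lag,
          round(spherical(lag, 0.1, 0.9, 100.0), 3),
          round(circular(lag, 0.1, 0.9, 100.0), 3),
          round(gaussian(lag, 0.1, 0.9, 100.0), 3),
          round(exponential(lag, 0.1, 0.9, 100.0), 3))
```

The spherical and circular models reach the sill exactly at the range, whereas the Gaussian and exponential models only approach it asymptotically, which is worth keeping in mind when reading a fitted range off a plot.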


(D) Selection of the Best-Fit Model 

Once the theoretical variograms have been fitted to the experimental
variogram, the best-fit geostatistical model can be selected on the basis
of a set of goodness-of-fit criteria, viz., mean error (ME), root mean
squared error (RMSE), correlation coefficient (r), mean standard error
(MSE), mean reduced error (MRE), reduced variance ($S_{Re}^2$), and
coefficient of determination ($r^2$). The details of these goodness-of-fit
criteria can be found in Table 3.
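
The criteria named above can be computed directly from paired observed and estimated values, for example from a cross-validation run. The sketch below follows the standard definitions of these statistics as summarized in Table 3; the function name `goodness_of_fit`, the dictionary keys and the optional `sigma` argument (the kriging standard deviation at each point, i.e. the square root of the estimation variance) are assumptions made for illustration.

```python
import math


def goodness_of_fit(observed, estimated, sigma=None):
    """Return a dict of goodness-of-fit criteria for paired values.

    sigma, if given, holds the kriging standard deviation at each location
    and enables MSE, MRE and the reduced variance.
    """
    n = len(observed)
    residuals = [o - e for o, e in zip(observed, estimated)]
    sse = sum(r * r for r in residuals)

    sum_o, sum_e = sum(observed), sum(estimated)
    sum_oe = sum(o * e for o, e in zip(observed, estimated))
    sum_oo = sum(o * o for o in observed)
    sum_ee = sum(e * e for e in estimated)
    mean_o = sum_o / n
    total_ss = sum((o - mean_o) ** 2 for o in observed)   # SSR + SSE

    stats = {
        "ME": sum(residuals) / n,
        "RMSE": math.sqrt(sse / n),
        "r": (n * sum_oe - sum_o * sum_e)
             / math.sqrt((n * sum_oo - sum_o ** 2) * (n * sum_ee - sum_e ** 2)),
        "r2": 1.0 - sse / total_ss,
    }
    if sigma is not None:
        reduced = [r / s for r, s in zip(residuals, sigma)]
        stats["MSE"] = sum(sigma) / n
        stats["MRE"] = sum(reduced) / n
        stats["S2Re"] = sum(q * q for q in reduced) / n
    return stats


observed = [1.0, 2.0, 3.0, 4.0]
estimated = [1.5, 2.5, 3.5, 4.5]
print(goodness_of_fit(observed, estimated, sigma=[0.5, 0.5, 0.5, 0.5]))
```

In this example the estimates are perfectly correlated with the observations (r = 1) yet biased by a constant offset, which shows why the correlation coefficient alone is not sufficient and the error-based criteria are examined alongside it.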


5.2.2. Inverse Distance Weighting Technique 

Inverse distance weighting (IDW) technique is one of the moving average 
methods for spatial interpolation. Moving average method performs a 
weighted averaging on point values and returns a spatial map as output 
based on a specified weight function and a limiting distance (Webster and 
Oliver, 2001). While applying the IDW technique for interpolating values of 
any water quality variable for an output point, the distances of all points 
(where the water quality parameter is known) towards the output point are 
calculated to determine weight factors for the points. The weight factors for 
the points are then calculated according to the specified weight function. 

Two weight functions are available (Burrough and McDonnell, 1998): 
inverse distance and linear decrease. Weight for the inverse distance function 
is expressed below: 


$\text{Weight} = \frac{1}{d^{n}} - 1$   (16)


where d = $D/D_0$ = relative distance of a known water quality point to the
output point, D = Euclidean distance of the known water quality point to the
output point, $D_0$ = limiting distance, and n = weight exponent.

The weights vary according to the relative distance of any known water 
quality point to output point and the weight exponent (Figure 5). 

Thereafter, for each output pixel, the value of the water quality variable
is calculated as the sum of the products of the calculated weight values and
point values divided by the sum of weights. That is,

$WQ = \frac{\sum_{i=1}^{p} (w_i \times v_i)}{\sum_{i=1}^{p} w_i}$   (17)


where WQ = value of the water quality variable concerned, $w_i$ = weight value
for the i-th point, $v_i$ = point value of the i-th point, and p = total number
of points within the limiting distance.
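
A minimal sketch of Equations (16) and (17) for a single output point is given below. The function name `idw_estimate`, the argument names, and the handling of an output point that coincides with a sample point are assumptions made for this illustration and do not correspond to any particular GIS package.

```python
import numpy as np


def idw_estimate(xy_known, values, xy_out, limiting_distance, exponent=2.0):
    """Inverse distance estimate at one output point, Equations (16) and (17)."""
    xy_known = np.asarray(xy_known, dtype=float)
    values = np.asarray(values, dtype=float)
    diff = xy_known - np.asarray(xy_out, dtype=float)
    dist = np.hypot(diff[:, 0], diff[:, 1])      # Euclidean distances D

    # Assumption: an output point lying on a sample point takes that value,
    # because the weight in Equation (16) is undefined at d = 0.
    exact = dist == 0.0
    if exact.any():
        return float(values[exact].mean())

    inside = dist < limiting_distance            # points within D0
    if not inside.any():
        return float("nan")                      # no sample within reach

    d = dist[inside] / limiting_distance         # relative distance d = D / D0
    w = 1.0 / d**exponent - 1.0                  # Equation (16)
    return float(np.sum(w * values[inside]) / np.sum(w))  # Equation (17)


if __name__ == "__main__":
    # Hypothetical sample locations (x, y) and concentrations
    pts = [(1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    conc = [10.0, 20.0, 100.0]
    print(idw_estimate(pts, conc, (0.0, 0.0), limiting_distance=4.0, exponent=1.0))
```

With a limiting distance of 4 and an exponent of 1, the two nearer points receive weights of 3 and 1, and the third point lies outside the limiting distance and is ignored.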


Table 3. Summary of the goodness-of-fit criteria

S. No. | Goodness-of-fit Criteria | Equation
1 | Mean Error (ME) | $ME = \frac{1}{n}\sum_{i=1}^{n}\left[z(x_i) - z^*(x_i)\right]$, where $z(x_i)$ and $z^*(x_i)$ = observed and estimated values of variable z at the location $x_i$, and n = number of data points.
2 | Root Mean Squared Error (RMSE) | $RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[z(x_i) - z^*(x_i)\right]^{2}}$
3 | Correlation Coefficient (r) (Rodgers and ...) | $r = \frac{n\sum_{i=1}^{n} z(x_i)\,z^*(x_i) - \sum_{i=1}^{n} z(x_i)\sum_{i=1}^{n} z^*(x_i)}{\sqrt{\left\{n\sum_{i=1}^{n} z(x_i)^{2} - \left[\sum_{i=1}^{n} z(x_i)\right]^{2}\right\}\left\{n\sum_{i=1}^{n} z^*(x_i)^{2} - \left[\sum_{i=1}^{n} z^*(x_i)\right]^{2}\right\}}}$
4 | Mean Standard Error (MSE) | $MSE = \frac{1}{n}\sum_{i=1}^{n}\sigma_k(x_i)$, where $\sigma_k(x_i)$ = estimation variance at the location $x_i$.
5 | Mean Reduced Error (MRE) (Vauclin et al., 1983) | $MRE = \frac{1}{n}\sum_{i=1}^{n}\frac{z(x_i) - z^*(x_i)}{\sigma_k(x_i)}$
6 | Reduced Variance ($S_R^2$) (Vauclin et al., 1983) | $S_R^2 = \frac{1}{n}\sum_{i=1}^{n}\left[\frac{z(x_i) - z^*(x_i)}{\sigma_k(x_i)}\right]^{2}$
7 | Coefficient of Determination ($r^2$) (Draper and Smith, 1998) | $r^2 = 1 - \frac{SSE}{SSR + SSE}$, where $SSE = \sum_{i=1}^{n}\left[z(x_i) - z^*(x_i)\right]^{2}$ and $SSR + SSE = \sum_{i=1}^{n}\left[z(x_i) - \bar{z}(x_i)\right]^{2}$, in which $\bar{z}(x_i)$ = mean of $z(x_i)$.


Figure 5. Inverse distance weights for relative distance of point to output point.


5.3. Statistical Measures in Spatial Context 


Availability of multiple observations of water quality attributes both at 
spatial and temporal scales provides an opportunity for exploring spatial and 
temporal variations of the water quality in an area. Spatial statistics of the 
water quality database (mean, median, standard deviation, coefficient of 
variation, etc.) can be easily computed in GIS. 

For example, if multi-year and multi-site water quality data are available
for a given area, annual concentration maps for individual parameters and
years can be prepared through the spatial interpolation techniques described
earlier in a GIS framework. Subsequently, the mean ($C_{mean}$) annual
concentration map for any water quality parameter can be created by using
the following equation (Machiwal et al., 2011):


$C_{mean,i} = \frac{\sum_{n=1}^{N} C_{n,i}}{N}$   (18)


where $C_{mean,i}$ = mean annual concentration map of the i-th water quality
parameter, $C_{n,i}$ = annual concentration map of the i-th parameter in the
n-th year, and N = total number of years of data availability.

In order to compute the median, first rank the annual observations from
the smallest ($C_1$) observation to the largest ($C_N$) observation and then
use one of the following equations depending on the number of observations (N):


$C_{median,i} = C_{(N+1)/2}$, when N is odd   (19)

$C_{median,i} = \frac{1}{2}\left[C_{N/2} + C_{(N/2)+1}\right]$, when N is even   (20)


The spatial standard deviation map can be prepared from the following
expression (Machiwal et al., 2011):


$C_{sd,i} = \sqrt{\frac{\sum_{n=1}^{N}\left(C_{n,i} - C_{mean,i}\right)^{2}}{N-1}}$   (21)


where $C_{sd,i}$ = spatially-distributed standard deviation map of the i-th
parameter.
Thereafter, the coefficient of variation maps for different parameters can
be developed using the following equation (Machiwal et al., 2011):


$CV_i = \frac{C_{sd,i}}{C_{mean,i}} \times 100$   (22)


Hydrologic variables with larger CV values are more variable than those
with smaller values. Wilding (1985) suggested a classification scheme for
identifying the extent of variability for soil properties based on their CV
values, where CV values of 0-15, 16-35 and >36 indicate little, moderate and
high variability, respectively.
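
The cell-by-cell statistics of Equations (18) to (22) can be sketched with NumPy by stacking the annual concentration rasters of one parameter along a time axis. The function name `temporal_statistics` is hypothetical, and expressing the CV in percent is an assumption chosen here to match the percentage thresholds of the classification above.

```python
import numpy as np


def temporal_statistics(annual_maps):
    """Per-cell mean, median, standard deviation and CV of N annual maps.

    annual_maps : sequence of N rasters (2-D arrays of equal shape), one per year
    """
    stack = np.asarray(annual_maps, dtype=float)   # shape (N, rows, cols)
    if stack.ndim != 3 or stack.shape[0] < 2:
        raise ValueError("need at least two 2-D annual maps")

    mean = stack.mean(axis=0)                      # Equation (18)
    median = np.median(stack, axis=0)              # Equations (19) and (20)
    sd = stack.std(axis=0, ddof=1)                 # Equation (21), N - 1 divisor
    cv = 100.0 * sd / mean                         # CV in percent (assumption)
    return mean, median, sd, cv


if __name__ == "__main__":
    # Three hypothetical annual maps on a 1 x 2 grid
    maps = [[[2.0, 10.0]], [[4.0, 10.0]], [[6.0, 40.0]]]
    mean, median, sd, cv = temporal_statistics(maps)
    print(mean, median, sd, cv)
```

`np.median` averages the two middle values when N is even, which corresponds to Equation (20).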

Typical ranges of CV values of salient soil properties are reported in the 
literature (Jury, 1986; Jury et al., 1987; Beven et al., 1993; Wollenhaupt et al., 
1997). 


6. GIS FRAMEWORK FOR GROUNDWATER 
VULNERABILITY MAPPING 


6.1. Groundwater Vulnerability Concept 


The groundwater vulnerability concept, which evolved at the end of the 1960s
in France, aimed at creating awareness of groundwater contamination (Margat,
1968; Albinet and Margat, 1970). The vulnerability concept in hydrogeology
began to be widely used in the 1980s (Haertle, 1983; Aller et al., 1987). It was
defined as the possibility of percolation and diffusion of contaminants from
the ground surface into the groundwater system.

Groundwater vulnerability deals only with the hydrogeological setting and
does not include pollutant attenuation. Initially, the term ‘vulnerability’ was
used to mean the relative susceptibility of aquifers to anthropogenic pollution,
without any formal definition. Later on, the concept began to mean different
things to different people. Margat (1968) used the term ‘vulnerability’ to mean
the degree of protection that the natural environment provides against the
ingress of pollutants to groundwater. Thereafter, several definitions of
vulnerability have been proposed. Foster (1987) defined aquifer pollution
vulnerability as the intrinsic character of the strata separating the saturated
aquifer from the immediately overlying land surface, which determines its
sensitivity to being adversely affected by a surface-applied (anthropogenic)
contaminant load. The National Research Council (1993) defined groundwater
‘vulnerability’ to contamination as the tendency or likelihood for
contaminants to reach a specified position in the groundwater system after
introduction at some location above the uppermost aquifer. Vrba and
Zaporozec (1994) defined ‘vulnerability’ as an intrinsic property of
groundwater, depending on its susceptibility to natural and/or human impact.
Groundwater vulnerability is a specific characteristic of the underlying
groundwater system and cannot be practically measured in the field.

In general, the status of groundwater contamination is determined by the
natural attenuation processes occurring within the zone between the pollution
source and the aquifer. Two natural factors, i.e., physical processes and
chemical reactions occurring within the soil, unsaturated zone and saturated
zone, are mainly responsible for altering the physical states and chemical
forms of contaminants, which ultimately leads to their attenuation. A single
chemical reaction or several reactions may act together with other processes,
resulting in a varying degree of attenuation.

These reactions depend on the specific soil and aquifer characteristics and 
particular geochemical properties of each contaminant. Thus, groundwater 
vulnerability is a function of geology and hydrogeology of the unsaturated and 
saturated zones and physico-chemical properties of the contaminants. 

All factors affecting groundwater vulnerability may vary from one place
to another. Groundwater vulnerability may be classified in two ways:
intrinsic vulnerability and specific vulnerability. The term ‘intrinsic
vulnerability’ refers to the vulnerability of groundwater to contaminants
generated by human activities, taking into account the inherent geological,
hydrological and hydrogeological characteristics of an area but being
independent of the nature of the contaminants. On the other hand, the term
‘specific vulnerability’ is used to define the vulnerability of groundwater
to a particular contaminant or group of contaminants, taking into account the
contaminant properties and their relationship with the various components of
intrinsic vulnerability (Doerfliger et al., 1999; Gogu and Dassargues, 2000).


6.2. GIS-Based Methods to Evaluate Groundwater Vulnerability 


The Geographic Information System (GIS) technique has fundamentally
changed the way we think about and manage natural resources in general and
water resources in particular (Jha et al., 2007).

GIS is designed to collect diverse spatial data and to represent spatially
variable phenomena by applying a series of overlay analyses to data layers
that are in spatial register (Bonham-Carter, 1996). Vulnerability assessment
is a basis for initiating protective measures for important groundwater
resources and will normally be the first step in groundwater pollution
assessment (Foster et al., 2002). The GIS technique is of great significance
in assessing the pollution vulnerability of aquifers over a large area.

Many approaches, such as process-based methods, statistical methods, and
overlay and index methods, have been developed to evaluate aquifer
vulnerability (Tesoriero et al., 1998). The spatial variability of land
vulnerability to groundwater contamination motivates the mapping of
groundwater vulnerability (Piscopo, 2001). In the process-based methods,
simulation models are used to estimate the movement of contaminants in
groundwater.

The major drawbacks of process-based methods are data shortage and
computational difficulties (Barbash and Resek, 1996).

In statistical methods, statistical relations are established between
spatial variables and the actual occurrence of pollutants in the groundwater.

Their major limitations include the lack of sufficient water quality
observations, limited data accuracy, and the need for careful selection of
spatial variables (Babiker et al., 2004). Overlay and index methods, which
result in vulnerability indices, mainly depend upon the factors that control
pollutant movement from the ground surface into the saturated zone. Their
main advantage is that vulnerability assessments can be made at a regional
scale, because some of the factors such as rainfall, soil type and groundwater
depth are easily available over large areas, which makes these methods
suitable for use with a geographic information system (Thapinta and Hudak,
2003).

In general, overlay and index methods and statistical methods are used for
contamination assessments at map scales smaller than 1:50,000 (i.e., a large
study area), while process-based simulation models are used at larger map
scales (i.e., a small study area) (Rao and Alley, 1993). Overlay and index
methods and statistical methods are used to assess intrinsic vulnerability,
while methods based on simulation models are used to assess specific
vulnerability.


6.3. Groundwater Vulnerability Mapping by GIS-Driven 
Overlay and Index Methods 


The most common approach to quantify aquifer vulnerability at present is
the overlay and index method, whereby the protective effect of the overlying
layers is expressed in a semi-quantitative way (Frind et al., 2006). Overlay and
index methods determine groundwater vulnerability efficiently. They involve
the overlaying and aggregation of multiple spatial maps, and such spatial
analyses of a group of maps can easily be performed in a geographic
information system.

Thus, the overlay and index methods are particularly suitable for use with
geographic information systems (Tilahun and Merkel, 2010). An overlay and
index method, being a multicriteria model, aggregates different hydrological/
hydrogeological factors that control the movement of pollutants from the
ground surface to the underlying aquifer.

A GIS-based overlay and index method combines the factors controlling
pollutant migration according to a certain multi-criterion rule and computes
the resulting value of the vulnerability index for different spatial locations.

A general methodology for applying groundwater vulnerability methods in
a GIS framework is shown in Figure 6.


[Flowchart boxes: Existing Maps; Remote Sensing Data; Geological and
Hydrogeological Data; Assigning Relative Weights to Different Thematic Layers;
Groundwater Vulnerability Map]

Figure 6. General GIS-based methodology for groundwater vulnerability study.

Before the 1980s, there had been several attempts to formulate and establish
a methodology for assessing vulnerability and presenting it as a map.
However, successful results were obtained only in the mid-1980s, when two
pioneering indices, DRASTIC (Aller et al., 1987) and GOD (Foster, 1987),
were reported.

Different methods identify many kinds of vulnerability, each associated with
a wide range of index values and labelled qualitatively. The categorization
of vulnerability into classes depends on the index values and on the number
of categories chosen by the analyst.

Groundwater vulnerability assessment has developed rapidly over the past
20 years; many new tools and techniques have been introduced, and specific
applications have been thoroughly analyzed and tested for different
environments (Cramer and Vrba, 1987; Meinardi et al., 1995; Secunda et al.,
1998; Lasserre et al., 1999; Al-Adamat et al., 2003; Lake et al., 2003;
Rodriguez et al., 2003; Thapinta and Hudak, 2003; GEAM, 2005; Allen and
Milenic, 2007; Zhou et al., 2010).

Moreover, many studies have used different scales and sources of 
information for the application of these techniques (Secunda et al., 1998; 
Foster et al., 2002; Civita and De Maio, 2004; Wang et al., 2007). 

The widely-used models of index methods include DRASTIC (Aller et al., 
1987), GOD (Foster, 1987), AVI rating system (Stempvoort et al., 1993), 
SINTACS (Gogu and Dassargues, 2000) and EPIK (Doerfliger et al., 1999). 

Conventional methods, e.g., DRASTIC, GOD, AVI, SINTACS, etc., do
not take into account the peculiar features of karstic (or carbonate) geological
formations. Thus, to address pollution vulnerability assessment in karstic
aquifers, a few specific methods, e.g., EPIK (Doerfliger and Zwahlen, 1998;
Doerfliger et al., 1999), PI (Goldscheider et al., 2000) and COP (Vias et al.,
2006), have been developed. Available methods (conventional as well as
non-conventional) for groundwater vulnerability mapping can be classified
into two groups as shown in Table 4.

Three major limitations of overlay and index methods are: (i) groundwater
vulnerability is defined in qualitative rather than quantitative terms
(Gogu et al., 2003; Frind et al., 2006; Popescu et al., 2008), (ii) the
uncertainty involved in vulnerability assessments is difficult to quantify,
which makes it hard to account for inaccuracies incurred in the analysis
(Gogu and Dassargues, 2000), and (iii) strongly homogeneous results are
obtained over large areas in many parts of the world, which hinders the
discrimination and delimitation of areas of different vulnerability to
pollution.

These problems are addressed in the study reported by Massone et al.
(2010), in which units with different categories of vulnerability are
discriminated within geologically homogeneous environments.

That study also avoids qualitative adjectives such as ‘low’ or ‘moderate’
because of their subjective meaning.


6.3.1. DRASTIC Method 

DRASTIC is one of the most widely used standard groundwater
vulnerability methods. It was developed by the United States
Environmental Protection Agency (USEPA) as a method for assessing
groundwater pollution potential (Aller et al., 1987). The seven most
important mappable factors that control groundwater pollution were
determined after a complete evaluation of many characteristics and of the
mappability of the data; the parameters are as follows:


Table 4. Conventional and modern methods for groundwater
vulnerability mapping

S. No. | Method | Parameters | Source

Methods for Porous Aquifers
1 | DRASTIC and Pesticide DRASTIC | D = Depth to water; R = (net) Recharge; A = Aquifer media; S = Soil media; T = Topography (slope); I = Impact of vadose zone; C = (hydraulic) Conductivity of the aquifer | Aller et al. (1987)
2 | DRAMIC | D = Depth to water; R = (net) Recharge; A = Aquifer media; M = Aquifer thickness; I = Impact of vadose zone; C = impact of Contaminant | Wang et al. (2007)
3 | GOD | G = Groundwater occurrence including recharge; O = Overlying lithology; D = Depth to groundwater | Foster (1987)
4 | AVI | c = Hydraulic resistance | Stempvoort et al. (1992)

Methods for Karstic (Carbonate) Aquifers
5 | SINTACS | S = Depth to groundwater; I = Recharge action; N = Attenuation potential of the vadose zone; T = Attenuation potential of the soil; A = Hydrogeologic characteristics of the aquifer; C = Hydraulic conductivity; S = Topographic slope | Civita (1994)
6 | EPIK | E = development of Epikarst; P = Protective cover; I = Infiltration condition; K = Karst network development | Doerfliger and Zwahlen (1995)
7 | GLA | S = effective field capacity of the soil (rating for FCe in mm down to 1 m depth); W = percolation rate; R = rock type; T = thickness of soil and rock cover above the aquifer; Q = bonus points for perched aquifer systems; HP = bonus points for hydraulic pressure conditions (artesian conditions) | Hoelting et al. (1995)
8 | PI | P = Protective cover; I = Infiltration conditions | Goldscheider et al. (2000)
9 | COP | C = flow Concentration; O = Overlying layers; P = Precipitation | Daly et al. (2002)

D — Depth to Water, R — (Net) Recharge, A — Aquifer Media, S — Soil Media, T — 
Topography (Slope), I — Impact of Vadose Zone, C — (Hydraulic) Conductivity of 
the Aquifer. 


These seven parameters are briefly described in Table 5. The DRASTIC
index model can be used to identify areas that are more vulnerable to
contamination than others, or to prioritize areas that need more
groundwater quality monitoring. It is a vulnerability index model designed to
calculate vulnerability scores (numerical values) for different locations by
combining seven thematic layers/factors.

Before combining the factors, ratings and weights are assigned to the
seven model parameters. Each parameter is divided into classes (ranges or
features), which are rated on a scale of 1 to 10 based on their relative
effect on groundwater vulnerability, with a rating of 10 indicating a high
pollution potential of the parameter.

Once the ratings are assigned to all classes of the parameters, weights
ranging from one to five, reflecting their relative importance with respect to
each other, are assigned to the seven parameters (Table 5).

The DRASTIC Index is then computed as a weighted linear combination of
all seven parameters, by multiplying each parameter rating by its weight and
adding together the resulting values according to the following equation
(Aller et al., 1987):


$DRASTIC_{index} = D_R D_W + R_R R_W + A_R A_W + S_R S_W + T_R T_W + I_R I_W + C_R C_W$   (23)


where D, R, A, S, T, I, and C are the seven parameters listed above, and
the subscripts R and W denote the corresponding ratings and weights,
respectively.
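
Equation (23) is a cell-by-cell weighted sum, which can be sketched as map algebra on rating rasters. In the sketch below, the function name `drastic_index` and the dictionary layout are illustrative assumptions; the weights are the values commonly quoted for the generic and pesticide versions of DRASTIC and should be verified against Aller et al. (1987) before use.

```python
import numpy as np

# Commonly quoted DRASTIC weights (assumption: verify against Aller et al., 1987)
DRASTIC_WEIGHTS = {"D": 5, "R": 4, "A": 3, "S": 2, "T": 1, "I": 5, "C": 3}
PESTICIDE_WEIGHTS = {"D": 5, "R": 4, "A": 3, "S": 5, "T": 3, "I": 4, "C": 2}


def drastic_index(ratings, weights=DRASTIC_WEIGHTS):
    """Weighted linear combination of the seven rating layers, Equation (23).

    ratings : dict mapping each parameter letter to a rating (scalar or raster)
    """
    missing = set(weights) - set(ratings)
    if missing:
        raise KeyError(f"missing rating layers: {sorted(missing)}")
    return sum(weights[p] * np.asarray(ratings[p], dtype=float) for p in weights)


if __name__ == "__main__":
    # Hypothetical ratings (1 to 10) for a single raster cell
    cell = {"D": 7, "R": 6, "A": 8, "S": 6, "T": 10, "I": 8, "C": 4}
    print(drastic_index(cell))                     # generic weights
    print(drastic_index(cell, PESTICIDE_WEIGHTS))  # pesticide weights
```

Because the ratings may be NumPy arrays of equal shape, the same function works for whole raster layers as well as for single cells.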

DRASTIC provides two weight classifications (Table 5), one for general
conditions and the other for conditions with intense agricultural activity.
The latter, called the Pesticide DRASTIC index (DRASTIC-P), represents a
specific vulnerability assessment approach. The DRASTIC-P method is the
most suitable in agricultural areas, mainly due to the greater weight given to
the soil and slope variables (Massone et al., 2007).

In recent years, the originally developed DRASTIC method has been
modified by using additional parameters or factors and/or by ignoring the
existing unimportant parameters according to the local characteristics of the
study area (Fritch et al., 2000; Al-Adamat et al., 2003; Lee, 2003;
Thirumalaivasan et al., 2003; Babiker et al., 2005; Simsek et al., 2006; Guo et
al., 2007; Wang et al., 2007; Umar et al., 2009; Martinez-Bastida et al., 2010;
Awawdeh and Jaradat, 2010). This reflects the flexibility of the DRASTIC
model, which can be modified according to the needs of the study. The
well-established DRASTIC method has been applied in different parts of the
world such as the United States (e.g., Rupert, 2001; Merchant, 1994; Loague
and Corwin, 1998; Wade et al., 1998; Stark et al., 1999; Fritch et al., 2000),
Canada (Murat et al., 2004), Europe (e.g., Stigter et al., 2006; Vias et al.,
2005), South America (Tovar and Rodriguez, 2004; Herlinger and Viero, 2006),
Australia (Piscopo, 2001), New Zealand (McLay et al., 2001), Asia (Al-Adamat
et al., 2003; El-Naqa, 2004; Thirumalaivasan et al., 2003; Rahman, 2008; Kim
and Hamm, 1999), and
Africa (Lynch et al., 1997; Ibe et al., 2001).

Table 5. Description of DRASTIC and Pesticide DRASTIC parameters

Parameter | Description | Relative Weight (DRASTIC) | Relative Weight (DRASTIC-P)
Depth to Water | Represents the depth from the ground surface to the water table; deeper water table levels imply lesser chance for contamination to occur. | 5 | 5
Net Recharge | Represents the amount of water which penetrates the ground surface and reaches the water table; recharge water represents the vehicle for transporting pollutants. | 4 | 4
Aquifer Media | Refers to the saturated zone material properties, which control the pollutant attenuation processes. | 3 | 3
Soil Media | Represents the uppermost weathered portion of the unsaturated zone and controls the amount of recharge that can infiltrate downward. | 2 | 5
Topography | Refers to the slope of the land surface; it dictates whether the runoff will remain on the surface to allow contaminant percolation to the saturated zone. | 1 | 3
Impact of Vadose Zone | Is defined as the unsaturated zone material; it controls the passage and attenuation of the contaminated material to the saturated zone. | 5 | 4
Hydraulic Conductivity | Indicates the ability of the aquifer to transmit water, and hence determines the rate of flow of contaminant material within the groundwater system. | 3 | 2

After Aller et al., 1987.

The DRASTIC model is applicable in humid climates (Babiker et al., 2005;
Piscopo, 2001; Kim and Hamm, 1999; Osborn et al., 1998) as well as in
semi-arid to arid climates (Werz and Hötzl, 2007; Al-Adamat et al., 2003;
Secunda et al., 1998). In the original DRASTIC index model,
semi-quantitative data layers were overlaid manually. However, because the
vulnerability index is a simple linear combination of factors, it can readily
be computed in a GIS (Fabbri and Napolitano, 1995).

For the past 15-20 years, the GIS technique has been widely used in
groundwater vulnerability mapping (Evans and Myers, 1990; Loague et al.,
1996; Hrkal, 2001; Rupert, 2001; Lake et al., 2003; Massone et al., 2010; Yin,
2013; Edet, 2014). The major advantages of GIS-based mapping are the
efficient combination of data layers and the ability to rapidly change the
data parameters used in vulnerability classification. Integration of the
DRASTIC method with GIS involves the following four steps (Massone et al.,
2010).


(i) Preparation of thematic base maps (as polygonal entities) for each
parameter under consideration using GIS software packages.
Subsequently, the polygon map of each parameter is transformed into
raster format using the spatial analysis functions of GIS. A suitable
spatial cell resolution for spatial analysis can be chosen.

(ii) The procedures indicated by the methodology are applied to assign
weights and values to each layer of information, and map algebra is
applied to obtain the aquifer vulnerability maps, called the
DRASTIC and DRASTIC-P vulnerability maps. Conveniently, the
DRASTIC index values can be discretized into five classes indicating
very low, low, moderate, high and very high vulnerability. This number
of classes allows both the “best” and the worst values to be
recognized with two alternatives each (high and very high, or low and
very low), which is better than using only three classes, where there
is only one possible option towards each end (low or high). This is
favourable to decision-making related to land-use planning,
environmental impact evaluations, etc.

(iii) Reclassification of the DRASTIC vulnerability maps to obtain the
DRASTIC-priorities, which recognize five classes from priority 1
(lower values in the series) to priority 5 (higher values).

(iv) Combining the DRASTIC vulnerability map with the
DRASTIC-priorities to generate an operational vulnerability index (OVI).
For this operation, both the vulnerability map and the DRASTIC-priorities
are reclassified, assigning to each qualitative class a numerical value
ranging from 1 (the lower class) to 5 (the higher one).
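
The discretization into five classes mentioned in step (ii) can be sketched as a simple raster reclassification. The text does not state how the class breaks are chosen, so the equal-interval breaks used below are an assumption, as are the function name `classify_five` and the use of 0 as a no-data code.

```python
import numpy as np

CLASS_LABELS = ("very low", "low", "moderate", "high", "very high")


def classify_five(index, vmin=None, vmax=None):
    """Reclassify an index raster into classes 1 (very low) to 5 (very high).

    Equal-interval breaks between vmin and vmax are an assumption; cells
    holding NaN (no data) are returned as class 0.
    """
    index = np.asarray(index, dtype=float)
    vmin = np.nanmin(index) if vmin is None else vmin
    vmax = np.nanmax(index) if vmax is None else vmax
    breaks = np.linspace(vmin, vmax, 6)[1:-1]      # four interior class breaks
    classes = np.digitize(index, breaks) + 1       # values 1 to 5
    classes[np.isnan(index)] = 0
    return classes


if __name__ == "__main__":
    # Hypothetical DRASTIC index values; 23 and 230 are the lowest and highest
    # sums possible with ratings of 1 to 10 and weights that add up to 23.
    raster = np.array([[23.0, 100.0, 120.0], [150.0, 230.0, np.nan]])
    classes = classify_five(raster, vmin=23.0, vmax=230.0)
    print(classes)
    print([CLASS_LABELS[c - 1] if c else "no data" for c in classes.ravel()])
```

Other break schemes (for example quantiles or fixed published thresholds) can be substituted by replacing the `breaks` array.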


6.3.2. DRAMIC (Modified DRASTIC) Method 

The DRASTIC method, originally developed for rural/agricultural lands,
has some limitations and requires modifications when applied in urban
areas.

First, the parameter C (hydraulic conductivity) of the DRASTIC method is
closely related to the parameter A (aquifer media). Thus, the impact of
aquifer media is counted twice.

Second, the topography of most urban areas is relatively flat (with
negligible slope), and therefore the parameter T (topography) can be
dropped from the DRASTIC method. Third, in urban areas, the ground surface is
mostly covered by built-up structures, concrete, etc., and it is quite
difficult to obtain comparable values of the parameter S (soil media). To
overcome these problems, and to improve the predictability and applicability
of the DRASTIC method, Wang et al. (2007) proposed the DRAMIC method. The
method is expressed in Equation (24), which shows the parameters of the
DRAMIC method and their respective assigned weights. Four parameters of the
DRAMIC method, i.e., D, R, A and I, are the same as in the DRASTIC method;
parameter T is deleted and the parameters S and C are replaced with two new
parameters, i.e., aquifer thickness (M) and impact of contaminant (C). The
DRAMIC index is described as (Wang et al., 2007):


$DRAMIC_{index} = 5D_R + 3R_R + 4A_R + 2M_R + 5I_R + 1C_R$   (24)


where D, R, A, and I are the same as in the DRASTIC method; M = aquifer
thickness defined by media; C = parameter showing impact of contaminant;
and the subscript R = rating. The computed DRAMIC index values can be used to
delineate areas that are more susceptible to groundwater contamination than
other areas. The higher the value of the DRAMIC index, the greater the
vulnerability to groundwater pollution. The hydrogeological significance,
ranges and ratings for the four factors D, R, A, and I of the DRAMIC method
are the same as in the DRASTIC method.
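
A worked sketch of Equation (24) is given below, including a lookup of the aquifer thickness rating from the ranges listed in Table 6. The function names are hypothetical, and because adjacent ranges in Table 6 share their end points, treating each upper bound as inclusive is an assumption.

```python
import numpy as np

# Upper bounds (m) of the aquifer thickness ranges and their ratings (Table 6)
THICKNESS_UPPER_BOUNDS = [6, 15, 25, 32, 40, 50]
THICKNESS_RATINGS = [9, 7, 5, 4, 3, 2, 1]          # last entry is for > 50 m

# Weights taken from Equation (24)
DRAMIC_WEIGHTS = {"D": 5, "R": 3, "A": 4, "M": 2, "I": 5, "C": 1}


def thickness_rating(thickness_m):
    """Rating of aquifer thickness; upper bounds are inclusive (assumption)."""
    idx = np.searchsorted(THICKNESS_UPPER_BOUNDS, thickness_m, side="left")
    return THICKNESS_RATINGS[idx]


def dramic_index(ratings):
    """Weighted sum of the six DRAMIC ratings, Equation (24)."""
    return sum(DRAMIC_WEIGHTS[p] * ratings[p] for p in DRAMIC_WEIGHTS)


if __name__ == "__main__":
    # Hypothetical site: 10 m thick aquifer, other ratings chosen arbitrarily
    site = {"D": 7, "R": 6, "A": 8, "M": thickness_rating(10.0), "I": 6, "C": 9}
    print(site["M"], dramic_index(site))
```

For this hypothetical site the thickness rating is 7 and the index is 5(7) + 3(6) + 4(8) + 2(7) + 5(6) + 1(9) = 138.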

The ranges and ratings for the two new parameters of the DRAMIC
method, i.e., aquifer thickness and contaminant characteristics, are listed in
Table 6.

6.3.3. GOD Scheme

The GOD scheme is one of the earliest vulnerability index methods. The GOD
rating system is an empirical method for quick assessment of vulnerability
incorporating three parameters: Groundwater occurrence including recharge,
Overlying lithology, and Depth to groundwater (Foster, 1987). To each
category of these parameters, namely the aquifer type (e.g., confined,
semi-confined, unconfined), the lithology of the overlying aquitard or
aquiclude (in case of a confined or semi-confined aquifer) or the aquifer
unsaturated zone (in case of an unconfined aquifer), and the depth to water,
a rating value between 0 (not vulnerable) and 1 (highly vulnerable) is
assigned. The vulnerability index is calculated using the following formula
(Foster, 1987):


$GOD_{index} = G_R \times O_R \times D_R$   (25)


where G = groundwater occurrence, O = overlying lithology (only in case of
unconfined aquifer), D = depth to groundwater, and the subscript R indicates
the rating of the parameters. A schematic of the GOD system for assessing the
aquifer pollution vulnerability index is shown in Figure 7. The index value
also ranges from 0 to 1 and gives the overall pollution vulnerability.

The vulnerability is classified into four classes according to the index
values: (i) low (GOD index < 0.3), (ii) moderate (0.3 < GOD index < 0.5),
(iii) high (0.5 < GOD index < 0.7), and (iv) extreme (GOD index > 0.7).
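
Equation (25) and the four vulnerability classes can be sketched as follows. The function names are hypothetical, and since the class limits above are given with strict inequalities, assigning a value that falls exactly on a limit to the higher class is an assumption.

```python
def god_index(g_rating, o_rating, d_rating):
    """Product of the three GOD ratings, Equation (25)."""
    for rating in (g_rating, o_rating, d_rating):
        if not 0.0 <= rating <= 1.0:
            raise ValueError("GOD ratings must lie between 0 and 1")
    return g_rating * o_rating * d_rating


def god_class(index):
    """Vulnerability class for a GOD index value.

    Values exactly on a class limit go to the higher class (assumption).
    """
    if index < 0.3:
        return "low"
    if index < 0.5:
        return "moderate"
    if index < 0.7:
        return "high"
    return "extreme"


if __name__ == "__main__":
    # Hypothetical ratings for two sites
    for ratings in [(0.6, 0.7, 0.8), (1.0, 0.8, 0.9)]:
        value = god_index(*ratings)
        print(ratings, round(value, 3), god_class(value))
```

Because the index is a product of ratings, a single low rating (for example a thick confining layer) lowers the overall vulnerability strongly, unlike in the additive DRASTIC-type indices.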


Table 6. Ranges and ratings for aquifer thickness and contaminant
characteristics for DRAMIC method

Aquifer Thickness: Range (m) | Aquifer Thickness: Typical Rating | Contaminant Characteristics: Characteristics | Contaminant Characteristics: Rating
0-6 | 9 | Stable, easy to infiltrate into aquifer | 9
6-15 | 7 | Stable, relatively easy to infiltrate | 7
15-25 | 5 | Stable, uneasy to infiltrate | 5
25-32 | 4 | Relatively stable, easy to infiltrate | 5
32-40 | 3 | Relatively stable, relatively easy to infiltrate | 4
40-50 | 2 | Relatively stable, uneasy to infiltrate | 3
>50 | 1 | Unstable, easy to infiltrate | 3
 | | Unstable, relatively easy to infiltrate | 2
 | | Unstable, uneasy to infiltrate | 1

After Wang et al., 2007.

The GOD method has not gained wide popularity, although its performance
has been assessed on a GIS platform in some recent studies
(Debernardi et al., 2008; Polemio et al., 2009; Kazakis and Voudouris, 2011).


6.3.4. AVI Rating System 
In this method, two physical parameters are considered: the thickness of
every sedimentary unit above the uppermost saturated aquifer surface (d) and
the estimated hydraulic conductivity of each of these sedimentary layers (k).
First, the hydraulic resistance (c) is calculated by the following equation
(Stempvoort et al., 1992):


$c = \sum_{i=1}^{n} \frac{d_i}{k_i}$   (26)


where n = number of sedimentary units above the aquifer; $d_i$ = thickness of
the i-th sedimentary unit above the uppermost aquifer; and $k_i$ = estimated
hydraulic conductivity of the i-th sedimentary unit. The parameter c is
defined as a theoretical factor used to describe the resistance of an aquitard
to vertical flow (e.g., Kruseman and de Ridder, 1990).

The hydraulic resistance (c) has the dimension of time and indicates the
approximate travel time for water to move downward by advection through the
various porous media above the uppermost saturated aquifer surface.
However, it should be noted that, in a strict sense, c is not a travel time for
water or contaminants.

The calculated c or log(c) values can be used directly to generate an
iso-resistance map by using geostatistical techniques for spatial
interpolation in GIS. The parameter c is related to a qualitative Aquifer
Vulnerability Index (AVI) by the relationship shown in Table 7.

The AVI rating system, originally developed and applied in Canada
(Stempvoort et al., 1992), was successfully applied in a GIS environment by
Stempvoort et al. (1993).
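
Equation (26) and the class limits of Table 7 can be sketched for a single borehole log as shown below. The function names and the layer values are hypothetical; thickness is assumed to be in metres and hydraulic conductivity in metres per year, so that c is obtained in years, and a value of log(c) lying exactly on a class limit is assigned to the less vulnerable class (an assumption, since adjacent ranges in Table 7 share their end points).

```python
import math


def hydraulic_resistance(layers):
    """Hydraulic resistance c = sum(d_i / k_i), Equation (26).

    layers : iterable of (thickness in m, hydraulic conductivity in m/year)
    """
    return sum(d / k for d, k in layers)


def avi_class(c_years):
    """Qualitative AVI class from hydraulic resistance in years (Table 7)."""
    if c_years <= 0:
        raise ValueError("hydraulic resistance must be positive")
    log_c = math.log10(c_years)
    if log_c < 1:
        return "Extremely High"
    if log_c < 2:
        return "High"
    if log_c < 3:
        return "Moderate"
    if log_c < 4:
        return "Low"
    return "Extremely Low"


if __name__ == "__main__":
    # Hypothetical log: 5 m of sand (0.5 m/year) over 10 m of clay (0.1 m/year)
    log = [(5.0, 0.5), (10.0, 0.1)]
    c = hydraulic_resistance(log)
    print(c, round(math.log10(c), 2), avi_class(c))
```

For this hypothetical log, c = 5/0.5 + 10/0.1 = 110 years, so log(c) is slightly above 2 and the site falls in the moderate vulnerability class.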


6.3.5. SINTACS Method 

The SINTACS method (Civita, 1994; Civita and De Maio, 2000), partially
derived from DRASTIC, retains only the structure of DRASTIC. It evaluates
the vertical groundwater vulnerability using the same seven parameters:
Soggiacenza (depth to groundwater), Infiltrazione (recharge action),
Nonsaturo (attenuation potential of the vadose zone), Tipologia della
copertura (attenuation potential of the soil), Acquifero (hydrogeologic
characteristics of the aquifer), Conducibilita (hydraulic conductivity) and
Superficie topografica (topographic slope). However, the SINTACS method is
more flexible than the DRASTIC method with respect to the ratings and weights
of the parameters.


Table 7. Relationship between Aquifer Vulnerability Index (AVI)
and hydraulic resistance

| Hydraulic Resistance (c) | Log(c) | AVI            |
|--------------------------|--------|----------------|
| 0-10 years               | <1     | Extremely High |
| 10-100 years             | 1-2    | High           |
| 100-1000 years           | 2-3    | Moderate       |
| 1000-10000 years         | 3-4    | Low            |
| >10000 years             | >4     | Extremely Low  |

After Stempvoort et al., 1992.
Figure 7. Schematic of GOD method for assessing aquifer pollution vulnerability.

The SINTACS method can easily be integrated with GIS: each
parameter is first computed and mapped over the study area as a raster
map. Thereafter, each mapped parameter is classified into ratings (ranging
from 1 to 10) that reflect its impact on potential pollution. Weight multipliers
are then applied to each parameter to balance and enhance their importance.

The SINTACS vulnerability index ($I_v$), defined as the weighted sum of the
seven parameters, can then be computed as (Civita, 1994):

$$I_v = \sum_{i=1}^{7} (P_i \times W_i) \qquad (27)$$

where $P_i$ = rating of the i-th of the seven parameters, and $W_i$ = associated
weight of the i-th parameter. The weight classes used by SINTACS depend on the
hydrogeological features of each area.
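
As a rough illustration of Equation (27) applied cell by cell, the sketch below combines seven small rating rasters into an index raster. The 2 x 2 rating grids, the layer names and the weight values are all hypothetical and chosen only for the example; in practice the weights must be selected for the hydrogeological setting, and a GIS raster calculator or array library would replace the nested lists.

```python
# Hypothetical weights, one per SINTACS parameter (illustration only).
WEIGHTS = {
    "depth": 5, "infiltration": 4, "vadose": 5, "soil": 3,
    "aquifer": 3, "conductivity": 3, "slope": 3,
}

# Hypothetical 2 x 2 rating rasters (ratings range from 1 to 10).
RATINGS = {
    "depth":        [[10, 4], [7, 1]],
    "infiltration": [[8, 5], [6, 2]],
    "vadose":       [[9, 3], [5, 2]],
    "soil":         [[7, 4], [4, 1]],
    "aquifer":      [[8, 6], [6, 3]],
    "conductivity": [[6, 5], [5, 2]],
    "slope":        [[10, 9], [3, 1]],
}


def sintacs_index(ratings, weights):
    """Return the raster of I_v = sum(P_i * W_i), evaluated cell by cell."""
    names = list(weights)
    n_rows = len(ratings[names[0]])
    n_cols = len(ratings[names[0]][0])
    return [
        [sum(ratings[name][r][c] * weights[name] for name in names)
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


index_map = sintacs_index(RATINGS, WEIGHTS)
for row in index_map:
    print(row)
```

Higher index values mark cells that are more vulnerable under the chosen weights, so the output raster can be classified directly into vulnerability zones.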


6.3.6. EPIK Method 

Several evaluations have shown the EPIK method to be a suitable
parametric weighting and rating tool for quantifying the vulnerability of karstic
(carbonate) aquifer zones.

Considering the geological, geomorphological and hydrogeological
characteristics of karst aquifers, the method takes into account the following
four parameters influencing flow and transport in karst (Doerfliger
and Zwahlen, 1995; Doerfliger and Zwahlen, 1998; Doerfliger et al., 1999):
Epikarst, Protective cover, Infiltration condition and Karst network
development. Descriptive information about the attribute features for each of
the four parameters may be found in Barrocu et al. (2007).

The parameters of the EPIK method constitute a protection index, $F_p$, to be
calculated for all parts of the catchment by the weighted linear combination
technique as follows:

$$F_{p,i} = a E_i + b P_i + c I_i + d K_i \qquad (28)$$

where i = 1, ..., n is the grid cell number; $E_i$, $P_i$, $I_i$, $K_i$ = weights
considered for the i-th cell; a, b, c, d = relative weights of the attributes
(constant for each attribute); and $F_{p,i}$ = protection factor of the i-th cell.
The lower the value of the protection factor calculated for a cell, the higher the
vulnerability of the karst aquifer. The step-by-step methodology for applying the
EPIK method is shown in Figure 8.


6.3.7. GLA Method

The GLA (Geologisches Landesamt) method, first proposed by Hoelting
et al. (1995), is based on a point count system similar to the DRASTIC
method. The GLA method was further developed by Goldscheider (2000) into
the PI method within the framework of the European COST Action 620.


[Figure 8 flowchart; recoverable input labels: Geomorphology, Soil Properties, Satellite Imagery, SRTM DEM Data, Geology]


Figure 8. Flowchart showing step-by-step methodology for applying GIS-based EPIK 
method. 


Unlike DRASTIC, the GLA method takes only the unsaturated zone
into consideration. Attenuation processes in the saturated zone are not
included in the vulnerability concept.

This restriction to the unsaturated zone is perhaps the major reason why
the method has not gained wide popularity and applicability.

In this method, the degree of vulnerability is specified according to the
protective effectiveness of the soil cover and the unsaturated zone. The six
parameters considered for the assessment of the overall protective
effectiveness are as follows (Hoelting et al., 1995):


Parameter 1: S - effective field capacity of the soil (rating for FCe in mm
down to 1 m depth)

Parameter 2: W - percolation rate

Parameter 3: R - rock type

Parameter 4: T - thickness of soil and rock cover above the aquifer

Parameter 5: Q - bonus points for perched aquifer systems

Parameter 6: HP - bonus points for hydraulic pressure conditions (artesian
conditions)


The protective effectiveness (PT) is calculated using the following
expression (Hoelting et al., 1995):

$$PT = P1 + P2 + Q + HP \qquad (29)$$

where P1 = protective effectiveness of the soil cover; and P2 = protective
effectiveness of the unsaturated zone (sediments or hard rocks).
Parameters P1 and P2 are defined as follows:

$$P1 = S \times W \qquad (30)$$

$$P2 = W \times (R_1 \times T_1 + R_2 \times T_2 + \dots + R_n \times T_n) \qquad (31)$$


In the German mapping approach, the highest value assigned to
factor W is 1.75, for an annual groundwater recharge of less than 100 mm
(Hoelting et al., 1995). A modified scale for the factor W was later introduced
to reflect the low amounts of groundwater recharge in many areas (Table 8).
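
A minimal sketch of Equations (29) to (31), with W looked up from the modified scale of Table 8, is given below. The ratings used in the example (S, R, T) are hypothetical, the function names are our own, and the assignment of a recharge value that falls exactly on a class boundary is an assumption, because Table 8 lists overlapping class limits.

```python
# Lower class bounds (mm/year) of Table 8 and the matching W values.
W_LOWER_BOUNDS = [(300, 1.0), (200, 1.25), (100, 1.5), (50, 1.75), (25, 2.0)]


def percolation_factor(recharge_mm_per_year):
    """Look up the modified W factor of Table 8 for an annual recharge."""
    if recharge_mm_per_year > 400:
        return 0.75
    for lower, w in W_LOWER_BOUNDS:
        if recharge_mm_per_year >= lower:  # boundary handling is an assumption
            return w
    return 2.25


def protective_effectiveness(s, recharge_mm_per_year, layers, q=0.0, hp=0.0):
    """Return PT = P1 + P2 + Q + HP.

    s      -- rating of the effective field capacity of the soil
    layers -- list of (R, T) pairs: rock-type rating and thickness in m
    q, hp  -- bonus points for perched aquifers and artesian conditions
    """
    w = percolation_factor(recharge_mm_per_year)
    p1 = s * w                                # Equation (30)
    p2 = w * sum(r * t for r, t in layers)    # Equation (31)
    return p1 + p2 + q + hp                   # Equation (29)


# Hypothetical site: S = 250 points, recharge 150 mm/year, two rock layers.
pt = protective_effectiveness(250, 150, [(20, 3.0), (5, 10.0)])
print(pt)
```

Because W multiplies both P1 and P2, the recharge class scales the whole protective effectiveness apart from the bonus points.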


6.3.8. PI Method 

The PI method is used for mapping the intrinsic vulnerability of
groundwater resources to pollution through a GIS-based approach
(Goldscheider et al., 2000). This vulnerability method is applicable to all kinds
of aquifers, but provides special methodological tools for karst aquifers.
Conceptually, the method is based on an origin-pathway-target model. The
land surface is taken as the origin of the contaminant, the water table in the
aquifer is the target that is vulnerable to contamination, and the pathway
includes all geologic layers in between. Aquifer vulnerability is assessed as the
product of two factors: (i) protective cover (P) and (ii) infiltration conditions
(I). The detailed assessment schemes for the two factors can be found in
Goldscheider et al. (2000), Goldscheider (2004) and Zwahlen (2004).

The PI method can be expressed as (Goldscheider et al., 2000):

$$p = P \times I \qquad (32)$$

where p = protection factor; P = parameter representing protective cover
conditions; and I = parameter describing infiltration conditions. Vrba and
Zaporozec (1994) proposed five classes of vulnerability (or protectiveness, p)
ranging from 1 to 5: a value of p = 1 indicates a very low degree of protection
and an extreme vulnerability to contamination, whereas p = 5 indicates a very
high degree of protection and a very low vulnerability.

In the PI method, the parameter P describes the protective function of all
subsurface layers that may be present between the ground surface and the
groundwater table: the topsoil, the subsoil, the non-karst rock and the
unsaturated zone of the karst rock.

Protectiveness is assessed on the basis of the effective field capacity (FCe)
of the soil, the grain size distribution (GSD) of the subsoil, the lithology,
fissuring and karstification of the non-karst and karst rock, the thickness of all
strata, the mean annual recharge and the artesian pressure in the aquifer (Kouli et
al., 2008). The parameter P is classified into five classes according to its value,
ranging from P = 1 (extremely low degree of protection) to P = 5 (very thick and
protective overlying layers). A decadic (base-10) logarithmic scale is applied,
so that each one-class increase in P corresponds to a ten times higher
protectiveness (e.g. a 10 m layer thickness instead of 1 m). The parameter I,
which is very critical for karst aquifers, describes the infiltration conditions.
In particular, it indicates the degree to which the protective cover is
bypassed by lateral surface and subsurface flows that enter the karst aquifer
at another place. Values of the parameter vary from 0 (steep slopes with
low-permeability soil) to 1 (horizontal and highly permeable soil). On steep
slopes of low permeability, surface runoff is diverted towards a sinking stream,
and the protective cover is entirely bypassed; on a horizontal plane of high
permeability, diffuse recharge occurs by infiltration and subsequent
percolation. For the remaining situations, intermediate values (0.2, 0.4, 0.6
and 0.8) of the I parameter are assigned depending on the soil properties
controlling the predominant flow process, the vegetation and slope gradient,
and the position of a given point inside or outside the catchment of a sinking
stream.

In a GIS application of the PI method, raster maps of the parameters P and I
are required, which may be generated through raster-based spatial
analyses performed in GIS.

Finally, the P and I raster maps are multiplied in GIS, and the resulting
p factor map is classified into suitable classes to identify areas of high
and low vulnerability.
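
The raster multiplication and classification described above might be sketched as follows. The P and I grids are hypothetical, and the class breaks (unit intervals of p, with p = 0 placed in class 1) and the names of the three intermediate classes are assumptions made for illustration; the text only fixes the two end classes.

```python
import math

# Class 1 = lowest protection; class 5 = highest protection.
# Intermediate labels are assumed for the sake of the example.
VULNERABILITY = {1: "extreme", 2: "high", 3: "moderate", 4: "low", 5: "very low"}


def protection_factor(p_raster, i_raster):
    """Multiply the P and I rasters cell by cell (p = P x I)."""
    return [
        [p * i for p, i in zip(p_row, i_row)]
        for p_row, i_row in zip(p_raster, i_raster)
    ]


def vulnerability_class(p):
    """Assign a p value (0 to 5) to one of five classes (assumed breaks)."""
    # Rounding guards against floating-point noise such as 0.6000000000000001.
    cls = min(5, max(1, math.ceil(round(p, 6))))
    return VULNERABILITY[cls]


# Hypothetical 2 x 2 rasters: P classes 1 to 5, I factors 0 to 1.
P = [[5, 4], [3, 2]]
I = [[1.0, 0.8], [0.2, 0.0]]

p_map = protection_factor(P, I)
class_map = [[vulnerability_class(p) for p in row] for row in p_map]
print(class_map)
```

Note how the cell with P = 3 but I = 0.2 ends up in the most vulnerable class: a low I factor means that most of the protective cover is bypassed.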
Table 8. Modified values of the parameter W (percolation rate)

| Groundwater Recharge (mm/year) | Percolation Rate (W) |
|--------------------------------|----------------------|
| >400                           | 0.75                 |
| 300-400                        | 1                    |
| 200-300                        | 1.25                 |
| 100-200                        | 1.5                  |
| 50-100                         | 1.75                 |
| 25-50                          | 2                    |
| <25                            | 2.25                 |


6.3.9. COP Method 

The COP method of groundwater vulnerability assessment was developed
mainly for carbonate (karst) aquifers. This method assesses the
intrinsic vulnerability of the aquifers based on three factors: flow
Concentration, Overlying layers and Precipitation.

According to the European approach (Daly et al., 2002; Goldscheider and
Popescu, 2004), the basic concept of this method is to assess the natural
groundwater protection (O factor), which is determined by the properties of
the overlying soils and the unsaturated zone.

The method also aims at estimating how the groundwater protection can
be modified by the infiltration process (i.e., diffuse or concentrated), defined by
the C factor, and by the climatic conditions (e.g., precipitation), defined by the
P factor (Kouli et al., 2008).

Furthermore, the COP method establishes detailed guidelines, standard
tables and formulae for vulnerability assessment and selects suitable variables,
parameters and factors to be used according to the European approach (Daly
et al., 2002; Zwahlen, 2004).

The method has the potential for wide acceptance in most countries
because the geoenvironmental data it requires are easily obtained with
some fieldwork, and no extensive input from GIS is needed.

Moreover, the method is applicable in different climatic conditions and
different types of carbonate aquifers, e.g. diffuse and conduit flow systems.
This flexibility makes the COP method practical and useful for planners and
decision makers framing and implementing suitable schemes of groundwater
protection.

The COP method, comprising the three factors used to evaluate the intrinsic
vulnerability of a groundwater resource, is expressed by the following formula
(Daly et al., 2002):

$$\mathrm{COP\ index} = C \times O \times P \qquad (33)$$

Scores are assigned to all three factors according to their relative impact
on the vulnerability of the karst aquifers. The numerical representations of the
C, O and P factor values (or scores) are then multiplied to assess the
vulnerability. In general, the final values of the COP index indicating the
intrinsic vulnerability range from 0 to 15, which can be suitably classified into
five vulnerability classes, i.e. very high, high, moderate, low and very low
vulnerability (Vrba and Zaporozec, 1994).

Based on the actual hydrogeological understanding of the aquifers, the COP
method has been evaluated as the most effective in comparison with other
methods such as DRASTIC, GOD, AVI, SINTACS, EPIK and PI for
assessing the prevailing vulnerability in southern Spain (Longo et al.,
2001; Brechenmacher, 2002; Vias et al., 2005, 2006; Andreo et al., 2006).


7. GIS-BASED WATER QUALITY INDEX 


The Water Quality Index (WQI) technique is very useful for evaluating
water quality (Abassi, 1999; Adak et al., 2001; Pradhan et al., 2001),
especially in resource-poor countries where cost is a major issue for water
resources management. In one of the pioneering works, Horton (1965) developed
general water quality indices by selecting and weighting several parameters.
Although there are no hard and fast rules for constructing a water quality
index, a WQI should be specific to a water use or a set of goals (Schultz,
2001). In general, two steps are required for developing a WQI. First, a set of
parameters needs to be selected that measure the important physical, chemical,
and microbiological water characteristics. Of course, the selection of such
parameters depends on the intended use of the water.

Second, once information about that set of parameters is available, a rule is
needed to summarize all the information in a single number, i.e., the ‘water
quality index’. The usefulness of water quality indices has been demonstrated in
water quality interpretation (e.g., Melloul and Collin, 1998; Soltan, 1999; Stigter et
al., 2006; Babiker et al., 2007; Ramesh et al., 2010; Machiwal et al., 2011;
Machiwal et al., 2013). Provencher and Lamontagne (1977) proposed one
pioneering WQI, in which several parameters are scored using the same
transformations, generally but not always linear, and a final global score is
reached. In the past, a variety of water quality indices have been proposed by
researchers worldwide (Table 9).

Geographic Information System (GIS) provides an efficient environment
for the development of a WQI.


Table 9. Different water quality indices developed and used
in the earlier studies

| S. No. | Name of Index | Country | Parameters Used in Water Quality Index | Source |
|--------|---------------|---------|----------------------------------------|--------|
| 1 | Groundwater Contamination Index | Finland | F, NO3, UO2, As, B, Ba, Cd, Cr, Ni, Pb, Rn, Se, pH, KMnO4 consumption, SO4, Cl, Ag, Al, Cu, Fe, Mn, Na and Zn | Backman et al. (1998) |
| 1 | Groundwater Contamination Index | Slovakia | TDS, SO4, Cl, F, NO3, NH4, Al, As, Ba, Cd, Cr, Cu, Fe, Hg, Mn, Pb, Sb, Se and Zn | Backman et al. (1998) |
| 2 | Groundwater Quality Index | [illegible in source] | [illegible in source] | Melloul and Collin (1998) |
| 3 | Groundwater Quality Index | Egypt | NO3, PO4, Cl, TDS, BOD, Cd, Cr, Ni and Pb | Soltan (1999) |
| 4 | Surface Water and Groundwater Quality Index | Croatia | Temperature, mineralization, corrosion coefficient, DO, BOD, total N, protein N, total P and total coliform | Štambuk-Giljanović (1999) |
| 5 | Groundwater Quality Index and Groundwater Composition Index | Portugal | NO3, SO4, Cl and Ca | Stigter et al. (2006) |
| 6 | Surface Water Quality Index | Argentina | Temperature, hardness, DO, pH, EC, alkalinity, turbidity, NO3, NO2, NH3, Cl and SO4 | Vignolo et al. (2006) |
| 7 | Malaysian Department of Environment Surface Water Quality Index | Malaysia | DO, COD, BOD, TSS, NH3-N and pH | Shuhaimi-Othman et al. (2007) |
| 8 | Groundwater Quality Index | Japan | Cl, Ca, Na, Mg, SO4, TDS, NO3 | Babiker et al. (2007) |
| 9 | Surface Water Quality Index | Spain | pH, EC, TSS, NH3, NO2, NO3, COD, BOD, DO, temperature and total P | Sanchez et al. (2007) |
| 10 | Fuzzy Surface Water Quality Index | Brazil | Temperature, pH, DO, BOD, coliforms, dissolved inorganic N, total P, total solids and turbidity | Lermontov et al. (2009) |
| 11 | Groundwater Quality Index | India | pH, EC, Na, Cl, SO4, total alkalinity, total hardness, Ca, Mg, Fe, F, NO3, NO2, Mn, Zn, Cd, Cr, Pb, Cu, Ni, total coliform, salmonella | Ramesh et al. (2010) |


In brief, the GIS-based WQI formulation process involves generating
representations of the spatial variability of originally scattered point
measurements and applying multiple transformations that convert the water
quality data into a corresponding index rating value. The steps involved
in the formulation of the GIS-based WQI proposed by Babiker et al. (2007) are
described in the following subsections.


7.1. Computing Normalized Difference Maps 


In the first step, spatial maps (C) representing the distribution of
concentrations of the water quality parameters are constructed
for each parameter from the point sample values by a spatial interpolation
technique within the GIS environment.

Thereafter, the observed spatial concentrations ($C_{obs}$) of the water quality
parameters are related to their maximum desirable limits ($C_{max}$) prescribed by
the WHO (2006) on a pixel-by-pixel basis using a GIS-based normalized
difference index ($ND_{index}$) as follows (Babiker et al., 2007):

$$ND_{index} = \frac{C_{obs} - C_{max}}{C_{obs} + C_{max}} \qquad (34)$$

The resulting $ND_{index}$ values for each pixel range between -1 and 1.
7.2. Assigning Rank to Different Water Quality Variables


The $ND_{index}$ maps are rated between 1 and 10 to generate a ‘rank map’.
Rank 1 indicates minimum impact on water quality, while rank 10
indicates maximum impact. The minimum $ND_{index}$ value (-1) is set equal to 1,
the median value (0) is set equal to 5 and the maximum value (1) is set equal
to 10. The following polynomial equation can be used to rank the
contamination level (or $ND_{index}$) of every pixel between 1 and 10:

$$R = 0.5 \times (ND_{index})^2 + 4.5 \times (ND_{index}) + 5 \qquad (35)$$

where R = rank value of every pixel corresponding to its $ND_{index}$ value.
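
The two transformations, the normalized difference of a concentration against its limit and the polynomial ranking, can be checked with a short Python sketch. The concentration grid and the limit of 50 units are hypothetical, and the function names are ours. The three anchor points stated in the text (-1 maps to 1, 0 to 5 and 1 to 10) are reproduced exactly by the polynomial.

```python
def nd_index(c_obs, c_limit):
    """Normalized difference between an observed concentration and its limit."""
    return (c_obs - c_limit) / (c_obs + c_limit)


def rank(nd):
    """Polynomial that maps ND = -1, 0, 1 to ranks 1, 5, 10."""
    return 0.5 * nd ** 2 + 4.5 * nd + 5.0


# Hypothetical interpolated concentrations and a hypothetical desirable limit.
concentration = [[0.0, 50.0], [150.0, 25.0]]
limit = 50.0

nd_map = [[nd_index(c, limit) for c in row] for row in concentration]
rank_map = [[rank(nd) for nd in row] for row in nd_map]

for row in rank_map:
    print([round(r, 3) for r in row])
```

A pixel exactly at the limit receives the middle rank of 5, while a pixel at three times the limit (ND = 0.5) already receives a rank above 7, which shows how the polynomial stretches the upper half of the scale.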


7.3. Developing Water Quality Index Map 


The Water Quality Index (WQI) is calculated as follows (Babiker et al., 2007):

$$\mathrm{WQI} = 100 - \left[ \frac{R_1 w_1 + R_2 w_2 + \dots + R_n w_n}{N} \right] \qquad (36)$$

where R = rate of the rank map (1-10); w = relative weight of the parameter,
which corresponds to the ‘mean’ rating value of its rank map (1-10);
and N = total number of parameters used in the suitability analysis.

The definition of WQI [Equation (36)] is similar to the weighted linear
combination technique. The weight (w) assigned to each parameter indicates
its relative importance to water quality and corresponds to the mean rating
value of its ‘rank map’. Dividing by the total number of parameters (N)
averages the weighted ranks and limits the index values to between 1 and 100.

The ‘100’ in the first part of the formula is incorporated so that high index
values close to 100 reflect ‘high water quality’, whereas index values far
below 100 (close to 1) indicate ‘low water quality’. All the steps for
developing a water quality index map are depicted in the flowchart shown in
Figure 9.
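
The WQI calculation can be sketched as below for a stack of rank maps. The two tiny rank maps are hypothetical, and the function names are ours; following the description above, each weight is taken as the mean value of the corresponding rank map.

```python
def raster_mean(raster):
    """Mean of all cells of a raster stored as a list of rows."""
    cells = [value for row in raster for value in row]
    return sum(cells) / len(cells)


def wqi_map(rank_maps):
    """Return WQI = 100 - (sum of R_i * w_i) / N for every pixel.

    rank_maps -- list of N rank rasters (values 1 to 10), one per parameter.
    Each weight w_i is the mean rank of the i-th rank map.
    """
    n = len(rank_maps)
    weights = [raster_mean(m) for m in rank_maps]
    n_rows, n_cols = len(rank_maps[0]), len(rank_maps[0][0])
    return [
        [100.0 - sum(m[r][c] * w for m, w in zip(rank_maps, weights)) / n
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Two hypothetical rank maps (one row, two pixels each).
rank_a = [[2.0, 4.0]]   # mean rank 3.0, so w = 3.0
rank_b = [[1.0, 9.0]]   # mean rank 5.0, so w = 5.0

print(wqi_map([rank_a, rank_b]))
```

In this example the second pixel, which has high ranks for both parameters, receives a clearly lower WQI than the first, and the parameter with the higher mean rank contributes more to the difference.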
8. GIS-COUPLED MULTIVARIATE STATISTICAL TECHNIQUES


Multivariate statistical analysis techniques such as principal component
analysis (PCA) and cluster analysis (CA) are very useful for classifying
aquifer groundwater quality according to the different pollution sources. The
results of multivariate statistical analyses of water quality
data can easily be combined with GIS in order to delineate the different
groundwater quality zones.

Mapping of groundwater contamination is often complicated by the infrequent
and uneven distribution of sampling locations, analytical errors in sample
analyses, and large spatial variation in observed contaminants over short
distances due to complex hydrogeologic conditions.

Also, uncertainty may be associated with the numerical modelling approach
used to delineate groundwater contamination plumes, owing to inadequate
knowledge about local hydrogeological conditions.


[Figure 9 flowchart; recoverable box label: Generating Rank Maps]


Figure 9. Flowchart depicting methodology for developing GIS-based water quality 
index maps. 
Furthermore, managing and mapping extensive water quality datasets can
be difficult because of the multiple locations, times, and analytes that may be
involved. An alternative to numerical modeling is to employ statistical analysis
of groundwater quality data to infer zones of potential contamination.

Principal component analysis (PCA) is a multivariate statistical
technique that groups the water quality variables on the basis of their
correlations with each other. The major aim of applying PCA and CA is to
consolidate a large number of observed water quality variables into a smaller
number of factors that can be more readily interpreted. Thus, the multivariate
statistical techniques reduce the dimensionality of the data (Dillon and Goldstein,
1984). PCA helps to identify the underlying geologic and hydrogeologic
processes associated with individual principal components (PCs, or factors)
based on the water quality variables grouped under each PC. The more PCs
extracted, the greater the cumulative proportion of the variation in the original
water quality data that is accounted for. PC loadings show the strength and sign
(positive or negative) of the relationship between each groundwater quality
variable and each PC. In order to determine the number of PCs to be retained,
the Kaiser Normalization Criterion (Kaiser, 1958) is used.

PCs that best describe the variance of the analyzed groundwater quality
data (eigenvalue > 1) and can be reasonably interpreted (Harman, 1960) are
accepted for further analysis.

The measure of how well the variance of a particular groundwater quality
parameter is described by a particular set of factors is known as its ‘communality’
(Jackson, 1991). Communalities are obtained by squaring the elements
(loadings) of the PC matrix and summing them across the retained PCs for each
variable. Ideally, if a PCA is successful, the number of PCs will be small, the
communalities will be high (close to 1) and the PCs will be readily
interpretable in terms of particular sources or processes (Dunteman,
1989). PCA has previously been used to generate accurate maps of
monitoring wells grouped by their water quality characteristics (Suk and Lee,
1999; Ceron et al., 2000; Güler et al., 2002).
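
The retention rule (eigenvalue > 1) and the communality calculation can be illustrated with a small loading matrix. The loadings below are hypothetical, as are the variable names. The sketch also assumes unrotated loadings from a PCA of standardized data, in which case the eigenvalue of a PC equals the sum of the squared loadings in its column.

```python
# Hypothetical loadings: one row per variable, one column per PC.
VARIABLES = ["Cl", "Na", "NO3", "SO4"]
LOADINGS = [
    [0.9, 0.1, 0.3],
    [0.8, 0.2, 0.4],
    [0.1, 0.9, 0.2],
    [0.6, 0.5, 0.1],
]


def eigenvalues_from_loadings(loadings):
    """Sum of squared loadings per PC (assumes unrotated loadings)."""
    n_pcs = len(loadings[0])
    return [sum(row[k] ** 2 for row in loadings) for k in range(n_pcs)]


def retained_components(eigenvalues):
    """Indices of the PCs whose eigenvalue exceeds 1."""
    return [k for k, ev in enumerate(eigenvalues) if ev > 1.0]


def communalities(loadings, retained):
    """Sum of squared loadings per variable over the retained PCs."""
    return [sum(row[k] ** 2 for k in retained) for row in loadings]


eigenvalues = eigenvalues_from_loadings(LOADINGS)
retained = retained_components(eigenvalues)
comm = communalities(LOADINGS, retained)

print([round(ev, 2) for ev in eigenvalues])
for name, h2 in zip(VARIABLES, comm):
    print(name, round(h2, 2))
```

Here the third PC falls below the eigenvalue threshold and is dropped; the communalities then show how much of each variable's variance the two retained PCs still describe.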

Suk and Lee (1999) performed multivariate statistical analysis in 
combination with GIS to correlate contaminant data with groundwater quality 
parameters for the purpose of identifying contaminated aquifer zones. 

Cluster analysis (CA) is another multivariate statistical analysis technique
that results in data reduction and that can be used to group monitoring sites
according to aquifer water quality behaviour (Suk and Lee, 1999). CA is
an unsupervised pattern recognition technique that uncovers the intrinsic structure
or underlying behaviour of a dataset without making a priori assumptions about
the data, in order to classify the objects of the system into clusters based on
their similarities (Otto, 1998). This method creates linkages between variables
hierarchically in the configuration of a tree with different branches. Branches
that have linkages closer to each other indicate a stronger relationship among
variables or clusters of variables. Mathes and Rasmussen (2006) demonstrated
the methodology for generating GIS maps of groundwater contamination using
multivariate statistical analysis of water quality data. GIS is an important tool
that is used to organize and manage large amounts of water quality
information for use in decision support systems.

Nowadays, GIS is routinely used for displaying water quality data in map
form, but statistical indicators of contaminant distributions are still rarely
used. Presently, the focus of GIS-coupled multivariate
statistical analysis techniques has shifted from mapping the observed
contaminant distribution to developing maps of contamination potential
using auxiliary water quality data.

Prior to applying multivariate statistical analysis, the observed
water quality data, $x_{ij}$, are generally standardized by z-scale
transformation as given below:

$$z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j} \qquad (37)$$

where $z_{ij}$ = standardized value; $x_{ij}$ = value of the j-th water quality
parameter measured at the i-th site; $\bar{x}_j$ = mean (spatial) value of the
j-th parameter; and $s_j$ = standard deviation of the j-th parameter.

An analysis performed with standardized data is expected to be less
influenced by differences in the variances of the variables. Furthermore,
standardization removes the influence of the different measurement units by
making the data dimensionless.
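
The z-scale transformation can be sketched as follows for a small site-by-parameter table. The data values are hypothetical, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the text does not say which form is intended.

```python
import statistics


def standardize(data):
    """Return z-scores column by column.

    data[i][j] is the value of parameter j measured at site i.
    The sample standard deviation is used (an assumption).
    """
    columns = list(zip(*data))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, sds)]
        for row in data
    ]


# Hypothetical data: three sites, two parameters in very different units.
data = [
    [10.0, 200.0],
    [20.0, 400.0],
    [30.0, 600.0],
]

for row in standardize(data):
    print(row)
```

After the transformation both columns have a mean of zero and unit standard deviation, so a parameter reported in large numbers no longer dominates the subsequent PCA or cluster analysis.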


CONCLUSION 


Evaluation of water quality is necessary for managing water quality so as
to ensure environmental sustainability. The important aspects of water quality
evaluation are the interpretation of water quality variables, the reporting of
results and recommendations for planners and decision makers. The logical
sequence of any water quality evaluation programme consists of three key steps:
monitoring, evaluation and management. Various tools and techniques are available
for water quality interpretation. However, selection of appropriate tools is
crucial for making the water quality evaluation effective.

Among the several tools used for water quality assessment, the geographic
information system (GIS) has been gaining wide acceptance among researchers
worldwide over the past two decades because of its capabilities for
capturing, storing, handling, analyzing and displaying large quantities of
water quality data.

It is clear that, with the advent of GIS, many conventional
methods of water quality evaluation have been interfaced with GIS to enhance
their usefulness. The conventional methods coupled with GIS can be
applied over relatively large areas. GIS-based spatial statistical analyses make
it possible to explore spatial and temporal variations of the water quality. The
point data of the water quality may be accurately converted to vector and
raster formats through the integration of geostatistical and GIS modeling
techniques. Raster formats of the water quality data further enable spatial
analyses to be performed on the GIS platform for groundwater vulnerability
mapping.

Among the various overlay and index methods for mapping groundwater
vulnerability, DRASTIC, GOD, AVI and SINTACS are the most widely applied
around the world. Later on, a few specific vulnerability methods
applicable to karst (carbonate) aquifers were developed, e.g. EPIK, GLA, PI
and COP.

Computation of a water quality index is another way of evaluating water
quality in which GIS plays a central role. With the aid of GIS, it is
possible to locate areas having poor groundwater quality. The definition
of WQI is flexible, and many researchers have developed different types of
water quality indices depending upon the data availability, aim of assessment,
geologic condition, aquifer type, etc. Recently, GIS has also been coupled
with multivariate statistical analysis techniques, e.g., principal component
analysis (PCA) and cluster analysis.

Finally, GIS is considered a powerful modern tool with great
flexibility to be combined with conventional methods of water quality
evaluation.

This tool has great potential to play a key role in water quality
evaluation to ensure sustainable management of natural resources. In future,
new areas for the assessment of water quality are to be explored in which the
application of GIS will further strengthen the interpretation of water
quality analyses, ultimately leading towards more sustainable planning and
utilization of water resources.
ABSTRACT


Flooding has become the most frequent natural disaster and has
attracted global attention. Because of its devastating nature, many lives
have been lost, natural ecology has been degraded, social and economic
activities have been disrupted, and properties worth billions of dollars have
been destroyed. These overwhelming effects are felt most in urban centers,
especially those located in coastal regions. Estimating flood risk has long been
a complex, multi-faceted problem because of the level and amount of knowledge
that must be combined from systematic disciplines such as geography,
geomorphology, climatology, hydrology, hydraulic engineering and urban planning.

Presently, this problem has been surmounted with the introduction of
geographic information systems (GIS), which, when properly integrated
with remote sensing techniques, can transform the way flood risk is modeled
and spatial information is extracted to support decision-making processes.
The main objective of this chapter is to examine the capability of
high-resolution remote sensing data and GIS techniques to
assess and identify flood-prone areas, before a flood occurs, in Lagos State,
a large coastal city of Nigeria. The GIS-based flood risk methodology
developed for a littoral urban region proved to be helpful in extracting
flood-prone areas based on elevation from the SRTM digital elevation model
(DEM) and proximity to the source of hazard. These risk areas were then
classified by magnitude of potential risk into five classes:
very high, high, moderate, low and very low. The extracted
flood mask was further used to estimate the proportion of agricultural
land and urban land likely to be affected in the event of a flood episode.
To support grassroots policy, the areal calculation was disaggregated by
local government area. Furthermore, land use/land cover data of
the study region were extracted from a Landsat image using a supervised
classification method based on the maximum likelihood algorithm. The
extracted potential flood risk masks were overlaid on the land cover data
so as to assess the likely impact of flooding on the various land uses,
namely agricultural and urban land use. In the same way, the identified
flood-prone area masks were entered into the Google Earth engine for the
purpose of quick visual impression and for mapping the flood-vulnerable
areas by neighborhood and road infrastructure. The five-class vulnerability
feature was then overlaid on the local administrative map data loaded
with the projected population figures of the study region. From this, the
vulnerable population was estimated by Local Government Area (LGA).
The results of the study show that GIS is a formidable tool for
flood risk analysis, mitigation and pre-hazard planning. This can be seen
from the series of thematic maps that were generated and used to
develop a large GIS-assisted database. The database so generated is expected
to facilitate flood risk management and provide an effective
framework to support policy formulation.


1. INTRODUCTION 


Floods have become the most frequent and widespread natural disaster,
claiming many lives, degrading natural ecology, disrupting social and
economic activities, and destroying properties and farmlands worth billions of
dollars (Nkeki et al., 2013; Taubenbock et al., 2011; Thieken et al., 2014;
Ologunorisa, 2006). Examples include the 1972 Rapid City flood in South Dakota,
United States, which claimed about 238 lives and caused millions
of dollars in damage; the 1975 Banqiao dam flood in China, which drowned
about 26,000 people and caused another 140,000 deaths as a result of epidemic
outbreaks; the 1983 Pacific flood in the northwest of the United States, which
destroyed properties estimated to be worth 1.1 billion dollars; and the Yangtze
River flood of 1998, which displaced 14 million people, claimed hundreds of
lives and damaged properties worth billions of pounds.

Recent examples include the devastating 2011 flooding in Thailand,
which affected 65 of the country's 77 provinces, resulted in a total of 815
deaths, left about 13.6 million people homeless and submerged 20,000 square
kilometers of farmland. The 2012 Niger-Benue river floods in Nigeria affected
14 of the 36 states in the country, displaced an estimated 1.3 million people,
claimed about 431 lives and submerged over 1,525 square kilometers of
farmland.

The devastating effect of flooding is felt more in urban centers, especially
those located in coastal regions. This is because, on the one hand, the terrain is
generally low and there are numerous water bodies (such as lakes,
lagoons, rivers and creeks), which are an essential source of
flooding. On the other hand, littoral cities are prominent centers of high
population concentration and prosperous economic regions.

The economic boom that characterizes such cities further drives high 
population density. Consequently, more space is needed for residential and 
other developmental purposes. To cope with this rapid growth, developmental 
activities spring up sporadically around the available urban space, in most 
cases along the shoreline, on low-lying terrain, close to swamps and lagoons, 
and in river valleys. The fact that urban planning regulations in most African 
cities are not keenly enforced further aggravates the issue of flooding and 
vulnerability. 

In Nigerian littoral cities, for instance, it is common to find developmental 
activities and structures erected along natural flow paths and near 
shorelines, floodplains and lowland areas. Such anthropogenic actions obstruct 
excess runoff and discharge during river flooding, which has been reported 
(Sanyal and Lu, 2004) to be a recurrent natural phenomenon in humid tropical 
and subtropical climatic regions, especially in the wet season. 

Basically, three major types of flooding occur in African coastal cities: 
river flooding, coastal flooding and urban flooding. River flooding is induced 
by heavy rainfall, excess runoff and discharge within the river valley, and 
its destructive impact is a function of distance from the natural channel. 
Coastal flooding is typically a function of storm surge, wind-driven waves and 
heavy rainfall. Urban flooding results when development is concentrated 
within or along stream channels (Nkeki et al., 2013). 

The consequences of flooding are graver in littoral cities than in cities in 
the hinterland. Apart from the submergence of urban properties, loss of lives, 
destruction of agricultural lands and crops, disruption of socioeconomic 
activities, environmental disfiguration and inflow into sewage systems 
(causing municipal pollution), flooding also leaves long-lasting contamination 
such as the intrusion of salt water into the soil, surface water and 
groundwater reserves. This may cause serious ecological distortion and lead to 
epidemic outbreaks, scarcity of potable water and destruction of arable land 
(leading to food shortages). 

Flood risk can be understood in terms of two major factors: the flood hazard 
itself and vulnerability. Flood hazard is a potentially destructive, 
weather-induced physical incident that is characterized by its location, 
intensity, frequency and probability (Taubenbock et al., 2011). The second 
factor, vulnerability, represents the system or phenomenon (e.g., people, 
specific ecological configuration, land use, urban system, political 
component) that is directly or indirectly exposed to the disastrous effect of 
the natural hazard. Thus, in an urban center the impact of flooding weighs 
heavily on the demographic, physical, social, economic, ecological and 
political composition of the affected area. Although flood hazard is 
climatologically driven, the drive for increased productivity, uncontrolled 
urban sprawl and rapid population growth in such urban areas increase the 
risk and susceptibility. 

However, urban flooding is directly and indirectly a function of man’s 
interaction with the physical environment. Such interaction involves designing 
and locating infrastructure, exploring and exploiting natural resources, 
concentration of population and urbanization (Hualou, 2011). Indirectly, these 
activities have arguably increased flood risk through climate change and 
directly through anthropogenic disturbances, obstruction and alteration of 
natural flow paths and floodplains. 

Since flooding is imminent in urban areas, formulating a management strategy 
is the primary approach to achieving sustainable development. This is the 
case in advanced countries, where technological knowledge is high and funding 
is available to accomplish flood-related objectives and projects such as 
hazard preparedness, quick response, and awareness and enlightenment 
campaigns that provide vital information on flood prone areas and their 
spatial extent. 

This is critical for most developing countries in Africa, where hazard 
preparedness is not a priority and the existing flood management policies and 
strategies (mostly outdated and ill-prepared) are not enforced and executed to 
the letter. In most cases, such strategic plans are geared towards 
compensating flood ravaged communities and dealing with the aftermath. 

In these countries, resources (finance, information) for developmental 
activities are limited owing to unstable political systems. Perhaps this 
explains why floods of equal magnitude cause more losses and damage in 
developing countries than in developed nations, which generally have 
well-structured monitoring and early warning systems (Opolot, 2013). Flood 
disaster management is a highly capital-intensive program because of the 
amount of spatial information required for risk prediction and assessment, 
especially when ground-based survey methods are adopted. To make the most of 
the limited financial resources in developing countries, a sustainable and 
cost-effective method of extracting and generating the required information is 
paramount. 

The first step in flood risk analysis and assessment is to identify the areas 
vulnerable to the hazard (Sanyal and Lu, 2004; Ishaya et al., 2009) and 
delineate these areas on a map. Initially, estimating flood risk was a 
complex, multi-faceted problem because knowledge from several systematic 
disciplines such as geography, geomorphology, climatology, hydrology, 
hydraulic engineering and urban planning needed to be combined. 

Current advances in geographic information system (GIS) and remote sensing 
techniques have revolutionized the modeling of flood risk and the extraction 
of spatial information to support decision making and the enactment of public 
policies. GIS allows the integration and synchronization of spatial data 
captured by space borne sensors orbiting the earth with demographic and 
socioeconomic data for the purpose of generating timely geospatial 
information for comprehensive risk mitigation planning (Tralli et al., 2005). 

The primacy of GIS technology in the field of disaster management, especially 
flood-related disaster, has been reported and adopted by contemporary spatial 
scientists (Nkeki et al., 2013; Sanyal and Lu, 2004; Taubenbock et al., 2011; 
Alaghmand et al., 2010; Samarasinghe et al., 2010; Tralli et al., 2005; 
Triglav-Cekada and Radovan, 2013; Opolot, 2013; Zhang et al., 2008; Wang, 
2004; Nirupama and Simonovic, 2007; Zheng et al., 2008; Irimescu et al., 
2010). 

The capability of GIS technology to visualize and analyze spatial and 
non-spatial data from diverse sources makes it a powerful platform for 
multilevel decision making, and so greater credit should be given to its 
geovisualization capabilities (Nkeki, 2013a). The technique is able to 
generate series of maps that summarize vital information useful for decision 
making and spatial planning. A GIS helps to manipulate remote sensing data in 
a spatial format and offers a friendly platform to integrate such datasets 
with non-spatial data to produce new maps. 

These maps are essential components for developing a GIS database for 
disaster assessment and management. A GIS database comprises a series of map 
layers that are geographically referenced, as well as attributes that can be 
linked to such map layers by a common identifier (Vine et al., 1997). Thus, 
each of the map layers contains, in most cases, one central geographic theme 
(thematic map) of the studied phenomenon. One of the fundamental merits of a 
GIS is its ability to synchronize several map layers for on-screen visual 
comparison and extraction of spatial relationships. 

This chapter focuses on the use of high resolution remote sensing data and 
GIS techniques to identify and assess flood prone areas (estimating the 
population and infrastructure at risk) before a flood occurs in Lagos State, 
a large coastal city of Nigeria, and to develop a more accurate, easily 
replicated geo-based flood risk model for sustainable mitigation strategies 
and policy plans. 


2. THE STUDY AREA 


Administratively, Lagos is the smallest state in Nigeria, yet this territory 
contains the largest urban center in the country and one of the largest 
coastal cities in Africa. It is composed of 20 local government areas (LGAs), 
of which 16 form the high density metropolitan region. 

Geographically, the region is located at the south-western edge of Nigeria 
in the West African sub-region between latitudes 6°22'N and 6°41'N and 
longitudes 2°42'E and 4°21'E (Figure 1a). It covers an overall area of 
approximately 3,829.84 km² with a perimeter of 454,328.4 meters, and water 
bodies of about 1,700 km². 

With respect to topography, Lagos can be categorized into three major 
geographic regions: a low-lying coastal zone along the Atlantic Ocean, 
consisting of beaches; a broad inland depression and flat land (sloping gently 
from the hinterland to the sea) surrounding the Lagoon and stretching from the 
eastern boundary of the state to the western border of the country (between 
Nigeria and Benin Republic), a zone comprising a series of marshlands and 
mangrove wetlands with elevations ranging from 0 to 25 meters above sea level; 
and three distinctive upland zones in the northern part of the state with an 
elevation of 70 meters above sea level. The region’s highest point is at Iroko 
and Olasore (Figure 1b). The Ogun River is the primary river that drains the 
region, and it empties into the Lagos Lagoon. The state has a coastline of 
about 180 km. 

Figure 1. Location of the study area: (a) the geographical position of Lagos 
State in Nigeria with its major water bodies; (b) 3-D terrain model of the 
state. 


Water bodies and wetlands cover over 40 percent of the overall land area of 
the state, and an additional 12 percent is subject to seasonal flooding 
(BNRCC, 2012). The region falls within the tropical rain forest belt and the 
eco-zones are predominantly wetlands and rain forest. The vegetation types 
found outside the urbanized area are mostly secondary forest, mangrove 
swamps, freshwater swamps and cultivated crops. 

Generally, the region is characterized by deep and poorly drained soils. 
Further detail concerning the ecological regions of the state is presented in 
Table 1. The climate of the state is the wet equatorial type, influenced by 
its nearness to the equator and the Gulf of Guinea. 


Table 1. Lagos State ecological regions and their physical characteristics 

Ecological region | Geology | Topography | Soil features | Eco-zone 
Ojo, Lagos Island, Badagry, Ibeju-Lekki, Surulere, Eti-Osa and areas close to the coast | Deltaic basis and tidal flats | Nearly level plains of 1-2% slope | Very deep, poorly drained and moderately well drained soils; sandy, sandy loamy or sandy clay loam surfaces over sandy clay, loam sub soils | Wetland 
Part of Ebute-Metta, Mushin and Shomolu, Kosofe, Agbowa, Ejinrin, parts of Epe and parts of Ikorodu like the Igbogbo areas | Recent alluvium | Nearly level to gently undulating plains of 2-4% slope | Deep, well drained and deep poorly drained soils; sand, sandy loam, loamy sand or sandy clay loam surfaces over sand, sandy clay, sandy clay loam, clay loam or loamy sand, sometimes gravel sub soils | Wetland 
Ikeja, part of Ebute-Metta, Mushin and Alimosho, Agege, Epe, part of Eredo, and part of Ejinrin | Coastal Plain Sands (Alfisols) | Nearly level plains with 1-2% slope | Very deep well drained soils; loamy, sand, sandy loam or sandy clay loam surfaces over sandy clay loam, clay loam, sometimes gravel type sub soils | Rain Forest 
Part of Eredo towards Ijebu-Ode, mainly the boundary of Lagos and Ogun States | Transitional materials of sub-recent alluvium and coastal plain sands | Nearly level plains of 1-2% slope | Very deep to deep and moderately deep well drained and few imperfectly drained soils; sand, sandy loam, or loamy sand surfaces over sandy loam, sandy clay loam or gravel type sandy clay loam sub soils | Rain Forest 
Parts of Ikorodu leading to Shagamu | Coastal Plain Sands | Gently undulating plains of 2-4% | Very deep well drained, and very deep poorly drained soils; sandy, sandy loam or sandy clay loam surfaces over sandy, loam, sandy clay, loam, sandy clay, or clay loam sub soils | Rain Forest 


Source: Adapted from FDALR (1995). 

The interaction between the warm, humid maritime tropical air mass and the 
hot, dry continental air mass from the interior gives the region two 
contrasting seasons: a wet season, which usually lasts from April to October, 
and a dry season, which lasts from November to March (BNRCC, 2012). In the 
wet season, two rainfall peaks are experienced, from May to July and from 
September to October. These peak periods are when acute flooding usually 
occurs in the state, and this is heightened by the poor surface drainage of 
the coastal lowlands (BNRCC, 2012). The mean annual rainfall for the state is 
roughly 1,657 mm, with a consistently high temperature (about 30°C for the 
mean monthly maximum). 

The actual population figure for Lagos State is disputed between the total 
officially presented by the National Population Commission of Nigeria in the 
2006 census exercise (9,113,605) and a far higher figure claimed by the state 
government (about 17.5 million, based on the parallel count conducted by the 
state during the National Census exercise). In this chapter, the official 
2006 National Census figure (9,113,605 people) is adopted. Despite this, the 
region remains one of the most populous and fastest growing states in the 
country. At a growth rate of 3.2 percent, the UNFPA population projection for 
the region is 11,867,082 in 2014 and 12,252,970 in 2015 
(http://nigeria.unfpa.org/lagos.html). 

Using the UNFPA projected figure for 2014, the population density is 
currently about 3,098 persons per km². This has serious implications for the 
available urban land and for the management and implementation of the existing 
urban design, because such uncontrolled growth has led to the development of 
slum villages, illegal structures and numerous disordered settlements. 
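
The density figure follows directly from the numbers quoted in this section. 
The short sketch below repeats the arithmetic; it assumes that the overall 
area of 3,829.84 km² is the denominator, which is what reproduces the quoted 
value. The function name is illustrative only. 

```python
# Values quoted in the chapter.
AREA_KM2 = 3829.84                # overall area of Lagos State
POP_2014_PROJECTED = 11_867_082   # UNFPA projection for 2014
POP_2006_CENSUS = 9_113_605       # official 2006 census total


def population_density(population: int, area_km2: float) -> float:
    """Return persons per square kilometre."""
    return population / area_km2


density_2014 = population_density(POP_2014_PROJECTED, AREA_KM2)
density_2006 = population_density(POP_2006_CENSUS, AREA_KM2)

print(f"2014 density: {density_2014:.1f} persons per km2")
print(f"2006 density: {density_2006:.1f} persons per km2")
```

The 2014 value works out to a little under 3,099, which agrees with the 
quoted 3,098 persons per km² once the fraction is dropped. 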

Consequently, it has accelerated indiscriminate land reclamation and the 
encroachment of residential and economic activities into natural water paths, 
wetlands, river valleys, extremely low-lying lands and floodplains. For 
example, the Makoko slum water community, which falls within the Yaba 
development area, rests entirely on the waters of the Lagos Lagoon 
(Figure 2). Its population is estimated to have approached 85,840. 


2.1. Flood Hazard in Lagos State 


Over the years, flooding has become a major natural disaster ravaging Lagos 
State annually. The primary cause of floods in the region is heavy rainfall, 
which increases the quantity of water discharged from rivers and lagoons until 
they overflow their channels and floodplains. 

Figure 2. An aerial view of Makoko slum settlement on the Lagos Lagoon. 


In 2010 and 2012, properties estimated to be worth millions of Nigerian Naira 
were ravaged in the Ikorodu axis as a result of the persistent overflow of the 
River Ogun and a rise in the Atlantic Ocean induced by heavy rainfall, which 
forced the Lagos Lagoon waters to rise above their channel and flow into the 
River Ogun floodplain. The flood water submerged and ravaged major roads and 
residential buildings (Figure 3). 

It was reported (Akinsanmi, 2011) that the River Ogun rose by up to 4 meters 
in October 2011 as a result of incessant rainfall in the south-western part of 
the country (particularly Ogun State). This led to an overflow of the river 
into surrounding towns such as Mile 12, Owode-Onirin, Agiliti, Isheri North, 
Majidun and Egbeda. In 2013, Lekki, a high density activity area of Lagos 
State, was affected by tidal flooding. The flood water also found its way into 
Jakande Estate, Lekki Beach and Elegushi areas. 

The Lagos State government has put numerous measures in place to check the 
hazard, including cleaning up blocked municipal drains, construction of canals 
and storm water channels, demolition of illegal structures (especially those 
located along stream channels), fortification of the beach line to mitigate 
the effect of ocean surge and coastal erosion, early flood warning and 
enlightenment campaigns. 

Figure 3. Major roads and residential properties damaged by floods in Lagos State. 


Despite these efforts, the region still suffers from flood disaster annually 
(mostly in the wet season). The stark reality is that flooding in littoral 
cities, with particular reference to sub-Saharan Africa, is difficult to 
prevent because of the many natural and anthropogenic factors involved. 

Climate change has further heightened the issue by increasing temperature and 
rainfall. The best practice is to reduce risk and vulnerability through 
permanent evacuation and relocation of activities from flood prone areas, and 
to declare such areas unsafe for human habitation. 

Hence, the flood prone areas must be identified spatially by producing a well 
synthesized flood risk map and geospatial database system for sustainable 
development. An area that stands out with respect to flood frequency and 
magnitude is the River Ogun floodplain. This floodplain is notorious for its 
high degree of submergence (Figure 4) and its effects cut across many large 
surrounding communities such as Ikorodu, Ajegunle, Magodu, Mile 12, 
Owode-Elede, Ibeje, Oworonsoki, Ajelogo, Maiden, Agboyi and Odo-Ogun. 

Studies (Olajuyigbe et al., 2012; Oyinloye et al., 2013) have reported that 
the fundamental causes of perennial flood hazard in this floodplain are 
twofold. The first is the release of excess water (accumulated from heavy 
rainfall) from the Oyan dam, constructed on the Oyan River, a tributary of the 
River Ogun in Ogun State; the release is an operational regulation of the dam 
management to ensure safety and avoid dam failure. The second is wave-induced 
sea level rise, which forces sea water into the Lagos Lagoon and consequently 
into the floodplain and surrounding lowlands. 


3. DATA 


The application of GIS techniques in flood risk analysis and modeling 
requires a wide range of spatial datasets from different sources. Such 
datasets include high resolution space borne images collected with active 
earth observation technology. Studies in the spatial sciences (Nkeki et al., 
2013; Triglav-Cekada and Radovan, 2013; Townsend and Walsh, 1998) have shown 
that combining this technology with a GIS platform has become an integrated, 
improved and reliable approach to disaster and risk management. 

In this chapter, Shuttle Radar Topography Mission (SRTM) remote sensing data 
were used to generate the elevation components, while Landsat and Google 
Earth datasets were used to assess the urban footprint and the land use 
characteristics of the study region. The former is a digital elevation model 
(DEM) with 90-meter spatial resolution. 

These data were originally produced by the National Aeronautics and Space 
Administration (NASA). They have become a major advancement in digital 
terrain modeling and provide free and easy access to high quality elevation 
data for researchers worldwide (Nkeki and Asikhia, 2014). 


Figure 4. Aerial view of flood hazard in River Ogun floodplain. 


The latter comprise the current Google Earth engine database (version 
7.1.322, developed in October 2013) and Landsat 8 data from 2013 captured 
with the Thermal Infrared Sensor. 

The SRTM DEM, with a resolution of 3 arc-seconds (about 90 meters), has been 
mosaicked into seamless (gapless) near-global coverage (up to 60 degrees north 
and south), making it one of the DEMs with the widest coverage. The SRTM 
captured elevation at two spatial resolutions: 30-meter and 90-meter. Only the 
90-meter data are available globally, while the 30-meter (1 arc-second) data 
are available for US territory alone. The SRTM carried a specially modified 
radar device on board the space shuttle, which collected elevation data during 
an 11-day mission in February 2000. The mosaicked elevation data are delivered 
in 1 x 1 degree tiles and in various raster formats accepted by most GIS 
applications. The digital data are freely available for download from the 
website of the US Geological Survey’s EROS Data Center. 
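
The relation between 3 arc-seconds and roughly 90 meters, and the number of 
1 x 1 degree tiles needed for the study area, can be checked with a few lines 
of arithmetic. The sketch below is illustrative only: it assumes a spherical 
earth with a mean radius of 6,371 km, and the function names are not taken 
from any GIS package. The bounding box uses the latitudes and longitudes 
given in Section 2. 

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean radius of a spherical earth


def arcsec_to_metres(arcsec, latitude_deg=0.0):
    """Return (north-south, east-west) ground size of an angular step."""
    step_rad = math.radians(arcsec / 3600.0)
    north_south = EARTH_RADIUS_M * step_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


def tiles_for_bbox(south, north, west, east):
    """List south-west corners (lat, lon) of the 1-degree tiles touching a box."""
    return [
        (lat, lon)
        for lat in range(math.floor(south), math.ceil(north))
        for lon in range(math.floor(west), math.ceil(east))
    ]


# Study area extent from Section 2, converted to decimal degrees.
SOUTH, NORTH = 6 + 22 / 60, 6 + 41 / 60
WEST, EAST = 2 + 42 / 60, 4 + 21 / 60

cell_ns, cell_ew = arcsec_to_metres(3, latitude_deg=6.5)
tiles = tiles_for_bbox(SOUTH, NORTH, WEST, EAST)
print(f"3 arc-second cell near Lagos: {cell_ns:.1f} m x {cell_ew:.1f} m")
print("1-degree tiles (south-west corners):", tiles)
```

Under these assumptions a 3 arc-second cell is a little over 92 meters on a 
side near the equator, and the study area spans three 1-degree tiles, which is 
why the tiles have to be merged before further processing. 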

The SRTM dataset is usually delivered in a preprocessed form, which involves 
filling in the voids (especially in Version 4.1) with an improved hole-filling 
algorithm that makes use of ancillary data sources. The mission used a 
dual-antenna, single-pass interferometric synthetic aperture radar (InSAR) 
operating at a wavelength of 5.6 cm (C-band). 

Landsat 8 (L8) became fully operational on April 11, 2013, and the satellite 
images the whole earth every 16 days at an 8-day offset from Landsat 7. Two 
optical sensors are on board L8: the Operational Land Imager (OLI) and the 
Thermal Infrared Sensor (TIRS). 

Unlike earlier Landsat sensors such as the multi-spectral scanner (MSS), 
thematic mapper (TM) and enhanced thematic mapper plus (ETM+), with 4, 7 and 
8 bands respectively, L8 data are composed of 11 bands. The thermal bands are 
acquired at 100-meter spatial resolution and resampled to 30-meter pixels in 
the delivered data. Pixel values are delivered as 16-bit data scaled to 
55,000 grey levels. A scene covers approximately 170 km north to south by 
183 km east to west. The OLI sensor captures data in the visible, near 
infrared and shortwave infrared wavelengths, including a panchromatic band. 
The TIRS captures data in two long-wavelength thermal infrared bands. The OLI 
has a circular error of 12 meters at 90 percent confidence and the TIRS a 
circular error of 41 meters at 90 percent confidence (USGS, 2013). 

Google Earth engine is a powerful platform which combines the strength of the 
Google search engine with satellite data, maps, terrain and 3D data to make 
both local and global geographic information available to end users (Nkeki, 
2013b). The high resolution images from this system were adopted in this 
chapter specifically for geovisualization and database creation. 

The most recent Google Earth engine (Google Earth Pro version 7.1.322) was 
used in this chapter because it captures the current urban footprint of the 
study region. Other vector data, such as the territorial boundary of Lagos 
State and the local administrative boundaries (polygons), were downloaded from 
a map library website. 


4. METHODOLOGY 


Various GIS spatial analysis, extraction and image classification techniques 
were applied to develop a flood risk modeling methodology that can easily be 
replicated (Figure 5). This was achieved mainly with ESRI ArcGIS version 10.1 
and Google Earth Pro version 7.1.322. 

The primary objectives of this chapter are to develop a comprehensive 
GIS-driven methodology for flood risk analysis in a littoral mega city and to 
demonstrate the capability of GIS technology to create a geodatabase system 
for flood risk management that contains spatially referenced information such 
as the flood prone areas and their risk magnitudes, and the estimated 
population and infrastructure at risk. 


4.1. Extraction, Vectorization and Reclassification of Elevation 
Data 


Elevation data are one of the principal datasets needed for GIS-based flood 
risk modeling. They are particularly useful for generating the potential 
inundation extent and classifying risk zones. 

Elevation data have been used by numerous researchers (Islam et al., 2001; 
Taubenbock et al., 2011; Nkeki et al., 2013; Townsend and Walsh, 1998; 
Sobowale and Oyedepo, 2013) to model flood risk and produce maps of potential 
flood prone areas. The SRTM DEM, which had been spatially corrected and 
preprocessed to fill in the voids in the data, was loaded into the 
ArcMap-ArcInfo environment for further processing. 

Using the DEM assembly tool extension in the software, the series of 1-degree 
elevation data tiles were merged and resampled to create a composite hydro 
DEM grid. The resulting raster image was then clipped to the shape and size of 
the study area (Figure 6a) with the ‘extract by mask’ tool of the spatial 
analyst tool box of the GIS software. 
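
The idea behind clipping a raster to a study-area mask can be shown without 
any GIS software. The sketch below is a plain Python illustration, not the 
ArcGIS tool: the grid values, the mask and the no-data marker of -9999 are 
hypothetical, and the function name is chosen only for readability. 

```python
NODATA = -9999  # assumed no-data marker, not taken from the chapter


def extract_by_mask(raster, mask, nodata=NODATA):
    """Keep raster cells where the mask is true; set all others to nodata."""
    same_rows = len(raster) == len(mask)
    if not same_rows or any(len(r) != len(m) for r, m in zip(raster, mask)):
        raise ValueError("raster and mask must have the same shape")
    return [
        [value if inside else nodata for value, inside in zip(row, mask_row)]
        for row, mask_row in zip(raster, mask)
    ]


# Hypothetical 3 x 3 elevation grid (meters) and study-area mask.
dem = [[3, 4, 9], [2, 6, 12], [1, 5, 15]]
study_area = [
    [True, True, False],
    [True, True, False],
    [False, True, True],
]
clipped = extract_by_mask(dem, study_area)
for row in clipped:
    print(row)
```

Cells outside the mask are not deleted but flagged, so later steps such as 
reclassification can skip them. 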


Figure 5. Methodological framework of GIS-based flood risk modeling. 
[Flowchart: the SRTM DEM and Landsat raster data are each projected to WGS 
1984 Web Mercator and masked to the study area.] 

In the process, the DEM was transformed from the original geographic 
coordinate system to a projected coordinate system (the World Geodetic System 
(WGS) 1984 Web Mercator). To ensure that the hydrological component of the 
DEM was correct and up to date, the elevation data were leveled and 
reconditioned using the vectorized hydrological features (lagoons and major 
rivers) extracted from the Landsat 8 imagery of 2013. 
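
The transformation from geographic coordinates to WGS 1984 Web Mercator can 
be written in closed form. The sketch below uses the standard spherical 
Mercator equations with the WGS 84 equatorial radius; the function names and 
the sample point are illustrative assumptions and do not come from the 
chapter or from ArcGIS. 

```python
import math

WGS84_SEMI_MAJOR_M = 6_378_137.0  # WGS 84 equatorial radius used by Web Mercator


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator meters."""
    x = WGS84_SEMI_MAJOR_M * math.radians(lon_deg)
    y = WGS84_SEMI_MAJOR_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


def mercator_scale_factor(lat_deg):
    """Ratio of map distance to ground distance at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))


# Illustrative point inside the study area extent (not a surveyed location).
x, y = lonlat_to_web_mercator(3.4, 6.5)
print(f"x = {x:.1f} m, y = {y:.1f} m")
print(f"scale factor at 6.5 degrees N: {mercator_scale_factor(6.5):.4f}")
```

At the latitude of Lagos the scale factor is within one percent of unity, so 
distances measured in map units, such as buffer widths of a few hundred 
meters, differ from ground distances by only a few meters. 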

The hydrologically updated SRTM data were used to generate a classified 
digital terrain model (CDTM) through a vectorization procedure. The 
fundamental reason for creating the vector-based CDTM is to aid the various 
forms of overlay operation (such as spatial intersect, union, erase, spatial 
join and identity) and to calculate spatial extent or area coverage. 

Figure 6. Processed elevation data of the study region: (a) Reconditioned DEM with 
vectorized hydrological constituents (in lapis lazuli blue); (b) Extracted contour lines 
with specified contour listing; and (c) Vector-based CDTM. 


A further reason is to group the elevation of the region into lowland and 
highland classes. To generate the CDTM, the DEM grid was passed to the 
spatial analyst surface algorithm to extract a list of contour lines 
(Figure 6b). 

The contour lines were then superimposed on the DEM to serve as a guide for 
on-screen digitizing. The final vector lines representing the boundaries of 
each elevation class were converted to polygons, and five classes of 
elevation were identified: <5 m, 6-10 m, 11-20 m, 21-40 m and 41-70 m above 
sea level (Figure 6c). 
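
The five elevation classes can also be expressed as a simple lookup. The 
sketch below is a raster-style illustration and not the contour-and-digitize 
procedure used in the chapter. It assumes that a value on a class boundary 
falls in the lower class, that fractional values between classes (for example 
5.5 m) go to the next class up, and that cells are 90 m on a side; the sample 
grid is hypothetical. 

```python
from bisect import bisect_left
from collections import Counter

# Upper bounds (meters) of the first four classes; the fifth is open-ended here.
CLASS_UPPER_BOUNDS_M = [5, 10, 20, 40]
CLASS_LABELS = ["<5 m", "6-10 m", "11-20 m", "21-40 m", "41-70 m"]


def elevation_class(elevation_m):
    """Return the label of the elevation class containing a value."""
    return CLASS_LABELS[bisect_left(CLASS_UPPER_BOUNDS_M, elevation_m)]


def class_areas_km2(dem, cell_size_m=90.0):
    """Sum the area of the cells falling in each elevation class."""
    counts = Counter(elevation_class(v) for row in dem for v in row)
    cell_area_km2 = (cell_size_m / 1000.0) ** 2
    return {label: counts[label] * cell_area_km2 for label in CLASS_LABELS}


# Hypothetical 2 x 3 grid of elevations in meters.
sample_dem = [[3, 8, 15], [4, 33, 65]]
for label, area in class_areas_km2(sample_dem).items():
    print(f"{label:>8}: {area:.4f} km2")
```

Counting cells per class gives the same kind of area summary that the vector 
polygons provide, at the cost of a stair-stepped class boundary. 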


4.2. Land Use/Land Cover Classification 


To estimate the urban footprint of the region for up-to-date flood risk 
management and assessment, a Landsat image from 2013 with zero percent cloud 
cover (Figure 7a) was preferred because it is recent, noise-free and easy to 
classify. The Landsat data were used for land use/land cover detection, which 
provides vital information on the actual urban extent. 

The main purpose of this analysis is to detect the urban built-up area and 
agricultural lands in order to identify and calculate the areal extent 
potentially vulnerable to flooding. For that purpose, the Landsat raster image 
was transformed from the geographic coordinate system to a projected 
coordinate system (WGS 1984 Web Mercator), clipped to the boundary of the 
study region and then classified. 

To generate a land use/land cover map of the region for 2013, a supervised 
image classification method was adopted using the maximum likelihood 
algorithm of ArcGIS. Training samples were collected from the raw Landsat 
image and used to create a spectral signature file containing the five 
identified classes of land cover: built-up area (high density), built-up 
area, cultivated land, forest and water body (Figure 7b). 

Figure 7. Processed Landsat imagery and its derivatives: (a) Raw Landsat data; (b) 
Extracted land use/land cover data; and (c) Five level multi-ring buffered hydrographic 
data of the region. 
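
The principle of maximum likelihood classification can be illustrated 
independently of ArcGIS. The sketch below is a minimal Gaussian version 
written with NumPy: each class signature is a mean vector and covariance 
matrix estimated from training pixels, and each pixel is assigned to the 
class with the highest log-likelihood. It assumes equal prior probabilities 
for all classes, and the two-band training data are synthetic. It is not the 
internal implementation of the ArcGIS tool. 

```python
import numpy as np


def train_signatures(samples, labels):
    """Estimate a mean vector and covariance matrix for each class."""
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    signatures = {}
    for cls in np.unique(labels):
        pixels = samples[labels == cls]
        signatures[cls] = (pixels.mean(axis=0), np.cov(pixels, rowvar=False))
    return signatures


def classify(pixels, signatures):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    pixels = np.asarray(pixels, dtype=float)
    classes = list(signatures)
    scores = np.empty((len(pixels), len(classes)))
    for j, cls in enumerate(classes):
        mean, cov = signatures[cls]
        diff = pixels - mean
        inv_cov = np.linalg.inv(cov)
        _, log_det = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to the class mean.
        mahalanobis = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        scores[:, j] = -0.5 * (log_det + mahalanobis)
    return np.array(classes)[scores.argmax(axis=1)]


# Synthetic two-band training pixels for two of the five classes.
rng = np.random.default_rng(0)
water = rng.normal([20.0, 10.0], 5.0, size=(50, 2))
built_up = rng.normal([120.0, 90.0], 5.0, size=(50, 2))
samples = np.vstack([water, built_up])
labels = np.array(["water"] * 50 + ["built-up"] * 50)

signatures = train_signatures(samples, labels)
print(classify([[22.0, 12.0], [118.0, 88.0]], signatures))
```

With real imagery the same two steps apply to all bands and all five classes, 
and the quality of the result depends mainly on how representative the 
training samples are. 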


More classes were not selected because the urban footprint is the primary 
concern in this chapter. The classification has an overall accuracy of 88 
percent. This was calculated by generating 200 random points within the 
extent of the region’s raster data in ArcMap and exporting them to Google 
Earth to extract reference land use/land cover information, which was then 
brought back into ArcMap for the error matrix and accuracy assessment. 
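
An error matrix and the overall accuracy derived from it are simple counts. 
The sketch below shows the calculation on a handful of hypothetical check 
points; the labels are invented for illustration and are not the 200 points 
used in the chapter. By the same arithmetic, an overall accuracy of 88 
percent on 200 points corresponds to 176 points on which the classified and 
reference labels agree. 

```python
from collections import Counter


def error_matrix(reference, classified):
    """Count how often each (reference, classified) label pair occurs."""
    if len(reference) != len(classified):
        raise ValueError("both label lists must have the same length")
    return Counter(zip(reference, classified))


def overall_accuracy(matrix):
    """Share of check points whose classified label matches the reference."""
    total = sum(matrix.values())
    agree = sum(n for (ref, cls), n in matrix.items() if ref == cls)
    return agree / total


# Hypothetical reference labels (from imagery) and classified labels.
reference = ["water", "water", "forest", "built-up",
             "built-up", "cultivated", "forest", "water"]
classified = ["water", "water", "forest", "built-up",
              "cultivated", "cultivated", "forest", "forest"]

matrix = error_matrix(reference, classified)
print(f"overall accuracy: {overall_accuracy(matrix):.2f}")
```

The off-diagonal entries of the matrix show which classes are confused with 
each other, which the single accuracy figure does not reveal. 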

From the classified land use data, each of the classes was converted to 
vector format to aid further analysis. The hydrographic class in particular 
was extracted as vector data so that spatial proximity analysis could be 
carried out on it. The extracted hydrographic layer, consisting of Lagos 
Lagoon, Lekki Lagoon, creeks and rivers, was further edited by filling in 
gaps, which are voids created by dense vegetation cover in the Landsat 
imagery. A spatial proximity map was generated from this hydrographic data 
(Figure 7c) by applying a multi-ring buffer operation to identify specified 
distances from the water bodies. 
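
A multi-ring buffer amounts to measuring the distance from a location to the 
nearest water feature and reporting the smallest ring that contains it. The 
sketch below works in projected coordinates (meters) and treats the water 
body as a polyline of its shoreline, which is a simplification. The ring 
distances are those selected in Section 4.3; the shoreline and test points 
are hypothetical. 

```python
import math
from bisect import bisect_left

RING_DISTANCES_M = [50, 100, 150, 200, 300]  # ring distances from Section 4.3


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the perpendicular foot, clamped to the segment ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_shoreline(p, shoreline):
    """Shortest distance from point p to a polyline given as a vertex list."""
    return min(
        point_segment_distance(p, a, b)
        for a, b in zip(shoreline, shoreline[1:])
    )


def buffer_ring(distance_m, rings=RING_DISTANCES_M):
    """Return the smallest ring distance covering a point, or None if outside."""
    i = bisect_left(rings, distance_m)
    return rings[i] if i < len(rings) else None


# Hypothetical straight shoreline, 1 km long, and three sample locations.
shoreline = [(0.0, 0.0), (1000.0, 0.0)]
for point in [(500.0, 40.0), (500.0, 120.0), (500.0, 450.0)]:
    d = distance_to_shoreline(point, shoreline)
    print(point, f"{d:.0f} m", "ring:", buffer_ring(d))
```

A point exactly on a ring boundary is assigned to that ring, and points 
beyond the outermost ring are reported as outside the buffered zone. 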


4.3. Buffering Distance Selection 


There is no universal rule for determining the appropriate distance from the 
shoreline of water bodies to the hinterland; it varies across countries, 
organizations and researchers. Some researchers in Nigeria (Adeniran et al., 
2013; Oyinloye et al., 2013; Oriola and Bolaji, 2012) adopted the 30-50 meter 
minimum setback prescribed by the 1986 town and country planning regulation. 
This regulatory setback was enacted on the basis of the physical appearance 
and architectural design of urban centers and the need to protect natural 
water bodies from pollution. It fails to take into consideration the general 
safety of the public with special reference to flooding. More importantly, 
when the regulation was passed (in 1986) the impact of climate change (rapid 
increase in annual rainfall and sea level rise) had not manifested as 
intensely as it has now. 

Others (Nkeki et al., 2013; Sobowale and Oyedepo, 2013; Ikusemoran et al., 
2013) have used previous flooding extent captured by near real-time satellite 
sensors and documented reports to determine the distance or areal extent prone 
to flooding. For the study region, there is no available or accessible recent 
satellite image of a previous flood, hence this chapter used reports and 
documented evidence of previous flooding extent to estimate the buffer 
distances. The maximum recorded flood extent in the study region is roughly 
200 meters from the sources of the hazard (the Ogun River and Lagos Lagoon). 
This was estimated from reported cases during the October 2011 and April 2013 
flooding. 

Also, a report released by the Lagos State Government during the October
2011 flood disaster showed that the Ogun River rose 4 meters above its normal
level (Akinsanmi, 2011). The Ogun River is the major tributary of the Lagos
Lagoon and one of the primary sources of flooding in the study area. This is
because of the yearly release of excess accumulated water from the Oyan dam
in Ogun State and the excess water stored in the Ikere gorge during the wet
season.

Every September, when the dam experiences a peak flood volume of 4,270
million m³, excess water averaging 8.5 meters in depth flows uncontrolled
downstream, causing havoc along its path (Sobowale and Oyedepo, 2013).
However, it is assumed that before it reaches the Lagos region, about half of
the flood water has been absorbed and lost to the floodplain, hence the
reported 4 meter rise. This information was used to estimate the maximum
distance that such flood water could submerge in the region by relating it to
the elevation of the adjacent land (Figure 8). The calculation shows that the
approximate maximum distance likely to be inundated when the river rises 4
meters above its normal level is 250 meters. Consequently, the five ring
buffer distances selected are 50, 100, 150, 200 and 300 meters.
Figure 8. Cross profile of Mile 12 in the Ogun River floodplain showing the possible
maximum inundation distance when the river rises up to 4 meters above its normal level.
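
The way a cross profile can be turned into an inundation distance may be
sketched as follows. This is a minimal illustration, not the chapter's actual
calculation: the function name, the linear interpolation between samples and
the sample profile are all assumptions, and the profile values are invented
(they are not the surveyed Mile 12 profile) so that the result falls at 250
meters.

```python
def inundation_distance(profile, water_rise_m):
    """Distance (m) from the bank at which land first rises above the flood level.

    profile: list of (distance_m, height_above_normal_water_m) pairs,
             ordered by increasing distance from the river bank.
    Returns None when the whole profile stays below the flood level.
    """
    prev_d, prev_h = profile[0]
    if prev_h > water_rise_m:
        return prev_d  # the bank itself is already above the flood level
    for d, h in profile[1:]:
        if h > water_rise_m:
            # Interpolate linearly between the last wet and the first dry sample.
            frac = (water_rise_m - prev_h) / (h - prev_h)
            return prev_d + frac * (d - prev_d)
        prev_d, prev_h = d, h
    return None


# Hypothetical profile: (distance from bank in m, height above normal level in m).
sample_profile = [(0, 0.0), (100, 1.5), (200, 3.0), (300, 5.0), (400, 7.0)]
reach = inundation_distance(sample_profile, water_rise_m=4.0)
print(reach)
```

With a real profile, the samples would come from the elevation model along a
transect perpendicular to the channel.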


4.4. Extracting Potential Flood Risk Mask 


Identifying areas with higher flood hazard potential is the most effective
measure for mitigating the impact of the hazard and the first step toward
formulating a sustainable flood management strategy. Creating a reliable
flood hazard map (one that shows areas with a high degree of susceptibility)
is one of the foremost concerns in the field of flood disaster management
(Sanyal and Lu, 2004).

To map the areas with a high probability of being flooded, this chapter
proposes that the likelihood of a hinterland being flooded (as a result of
rain-induced excess runoff, waves, etc.) is directly linked to its proximity
to the agents of flooding (hydrographic features) and to how close its
elevation is to the source water level. From the multi-ring buffered
hydrographic data and the CDTM created in previous sections, a flood risk
mask was generated through a series of spatial analysis techniques: overlay
intersect, spatial join, union and erase functions.

The potential flood risk areas were categorized by magnitude on the basis of
the CDTM and proximity data (Table 2). This is because, as noted by Nkeki et
al. (2013), the devastating impact of flooding disaster decreases with
increasing distance from the source river channel.
Table 2. Categorization parameters for potential flood risk by magnitude

Risk magnitude    Elevation class (m)    Buffered distance (m)
Very high         <5                     50
High              0-5                    100
Moderate          0-5                    150
Low               6-10                   200
Very low          6-10                   300
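
The categorization in Table 2 can be expressed as a small lookup. The sketch
below is only one possible reading: the chapter does not spell out how the
elevation and distance criteria were combined in the overlay, so the function
name, the "first matching ring" rule and the treatment of class boundaries
(up to 5 m, up to 10 m) are assumptions.

```python
# Rules transcribed from Table 2: (risk label, max elevation in m, ring distance in m).
RISK_RULES = [
    ("Very high", 5, 50),
    ("High", 5, 100),
    ("Moderate", 5, 150),
    ("Low", 10, 200),
    ("Very low", 10, 300),
]


def classify_flood_risk(elevation_m, distance_m):
    """Return the first Table 2 class whose ring and elevation limit both fit.

    Returns None for locations beyond the 300 m ring or above 10 m elevation.
    """
    for label, max_elevation_m, ring_m in RISK_RULES:
        if distance_m <= ring_m and elevation_m <= max_elevation_m:
            return label
    return None


print(classify_flood_risk(elevation_m=2, distance_m=40))
print(classify_flood_risk(elevation_m=8, distance_m=250))
```

Under this reading, a site close to the water but 6-10 m high falls into the
low risk class rather than the very high one, because elevation limits the
class as well as distance.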
Figure 9. Spatial distribution of flood prone areas by magnitude in the study region.


The flood risk mask generated in this way (Figure 9) was overlaid on the land
cover and administrative/population density datasets to detect and estimate
vulnerable components such as urban and agricultural land use. Specific
attention was focused on these land uses because they are highly valued with
respect to flooding impact. They are economically and infrastructurally
important and also involve human lives, which is why they are frequently used
as a measure of flood damage and its level of seriousness.

Linking elevation data, such as SRTM DEM with high spatial resolution, to a
hydrographic dataset and progressively aligning the result with
satellite-derived land use data is a methodology that is likely to yield a
more accurate result and to produce a more reliable flood risk map, database
and modeling procedure than one derived purely from hydrological modeling.
The major limitation of this method rests on the accuracy of the adopted DEM
dataset and land use data, and on the overall competency of the analysts.
SRTM data and Landsat imagery have become paramount and are among the most
widely used datasets among contemporary researchers. The latter was
complemented with Google data because of its resolution and distinctiveness
in displaying surface features. It serves as a better guide for the analyst
in assessing the accuracy of the training data collected and of the
vulnerable features.


5. RESULTS AND DISCUSSION 


In this section, the extracted flood risk mask is related to the susceptible
features with the aim of creating a comprehensive GIS-based flood risk
database system for the area. The susceptible features are the high-valued
land uses: urban and agricultural areas. The former includes the population
and infrastructure (roads and rail tracks), while the latter includes
cultivated and arable lands.


5.1. Spatial Assessment and Analysis of Land Use Vulnerability 


The result of the supervised land use classification (Figure 7b) revealed
that the urban footprint extends over as much as 1,092.7 km², covering 28.7
percent of the total area of Lagos State. Higher density urban area covers
7.4 percent, cultivated land use occupies 22.3 percent and forest land 31.5
percent. The flood risk mask, which has been classified into five levels of
risk (very high, high, moderate, low and very low), was superimposed (by
spatial intersect) on the land use map derived from remotely sensed data to
visualize and quantify features exposed to flood hazard. Figure 10 shows the
spatial distribution of vulnerability to flooding disaster across the land
use classes. The overall areal extent in the land use map potentially exposed
to flood hazard is roughly 296 km². High-valued land uses, such as urbanized
and arable areas (the central focus of this chapter), account for
approximately 107 km² of the potentially endangered land, which is about 36
percent of the total vulnerable area. Additionally, more than 7 percent of
the overall urbanized area is susceptible to the hazard.

The results in Table 3 (extracted from the attribute table of the spatial
analysis in ArcMap) show that more than 48 km² of the higher density urban
land use is likely to be affected in a flood scenario. The analysis indicates
that intensive urban activities are concentrated close to water bodies and,
in most cases, on extremely low-lying terrain.
Figure 10. Flood risk and land use vulnerability map of Lagos State.


Such high concentration is found around the western and southern edges of
the Lagos Lagoon. This is why, among the high-valued land use classes, higher
density urban area is the most exposed to flooding disaster.

A cursory examination of Table 3 reveals that the same pattern is manifested
within the vulnerability classes of the higher density urbanized land use
category, i.e., the largest share of land occupied by this category (26.7
percent across the five classes) falls within the very high risk class. It is
therefore estimated that inundation will cover roughly 83 km² of built-up
area within the urban space.

As revealed by the spatial analysis, over 24 km² (8.2 percent of the overall
flood prone area) of agricultural land, including cropland and arable land,
is exposed to flood hazard. Though roughly half of this lies within the very
low risk class, it nonetheless has grave implications for food security in
the state.

This is because the state is experiencing a disconcerting and frenetic
population inflow, leading to further urban expansion and consequently to
dwindling contiguous agricultural lands.
Table 3. Vulnerability class by different land use categories

Parameters                                        Land use exposed to flood risk (km²)
Proximity to source   Elevation   Vulnerability   Urbanized area     Urbanized      Agricultural   Forest
of hazard within      class (m)   class           (higher density)   area           land
buffer distance (m)
50                    <5          Very high       12.85 (4.4)        1.74 (0.6)     1.37 (0.5)     32.47 (11.0)
100                   <5          High            9.15 (3.1)         4.17 (1.4)     2.68 (0.9)     31.93 (10.8)
150                   <5          Moderate        6.28 (2.1)         5.28 (1.8)     3.26 (1.1)     30.00 (10.1)
200                   6-10        Low             8.01 (2.7)         8.90 (3.0)     5.95 (2.0)     39.51 (13.3)
300                   6-10        Very low        11.78 (4.0)        14.59 (4.9)    10.93 (3.7)    55.18 (18.6)
Total                                             48.07 (16.3)       34.68 (11.7)   24.19 (8.2)    189.09 (63.8)
Note: values in parentheses are the corresponding percentage values.
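
The aggregate figures quoted in the text (roughly 296 km² exposed, about 107
km² or 36 percent of it high-valued land, and roughly 83 km² of built-up
area) can be re-derived from Table 3. The short check below is an
illustration only; the variable names are arbitrary, and the agricultural
entry for the low risk class is read as 5.95 km², the value consistent with
its 2.0 percent share and with the column total of 24.19 km².

```python
# Exposed area (km²) per land use, ordered very high -> very low (Table 3).
exposed_km2 = {
    "urban_higher_density": [12.85, 9.15, 6.28, 8.01, 11.78],
    "urbanized": [1.74, 4.17, 5.28, 8.90, 14.59],
    "agricultural": [1.37, 2.68, 3.26, 5.95, 10.93],
    "forest": [32.47, 31.93, 30.00, 39.51, 55.18],
}

totals = {name: sum(values) for name, values in exposed_km2.items()}
total_exposed = sum(totals.values())

# High-valued land uses are everything except forest.
high_valued = total_exposed - totals["forest"]
high_valued_share = 100 * high_valued / total_exposed

# Built-up area combines both urbanized categories.
built_up = totals["urban_higher_density"] + totals["urbanized"]

print(f"total exposed: {total_exposed:.2f} km²")
print(f"high-valued:   {high_valued:.2f} km² ({high_valued_share:.1f}%)")
print(f"built-up:      {built_up:.2f} km²")
```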


This can be deduced from the result of the supervised classification (Figure
7b), which shows a transitional pattern in which urban land use is engulfing
agricultural lands and the latter, in turn, are encroaching into forest
lands.


5.2. Spatial Analysis of Urban Infrastructural Vulnerability 


Infrastructural vulnerability concerns the roads and rail tracks that are
exposed to flooding hazard because they lie in the flood prone area of the
region. This infrastructure has high economic value and is highly susceptible
to the hazard. For example, asphalting roads requires huge financial
resources, and the surface can easily be destroyed by persistent
waterlogging.

In addition, roads are the primary means of access in an urban space, and
when they are flooded, economic activities are obstructed. Road
infrastructure includes paved expressways, major roads and residential roads,
while rail infrastructure includes active rail tracks. These were further
grouped and reclassified as dual and single lane.

To map, visualize and estimate the length of road and rail track susceptible
to flood disaster, the generated flood risk map (Figure 9) was set to 60
percent transparency and then converted to Keyhole Markup Language (KML)
format. KML is an excellent format for sharing geographic data because it can
package graphic and non-graphic data, whether raster, vector or both,
together with their spatial references, symbology, etc., into one file.
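
To make the transparency setting concrete, the sketch below writes a minimal
KML document for one risk polygon using only the Python standard library. It
is an illustration under stated assumptions, not the ArcMap export used in
the chapter: the function names, the style id and the coordinates are
hypothetical. KML stores colors as hexadecimal aabbggrr, so 60 percent
transparency corresponds to 40 percent opacity (alpha 0x66).

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def kml_color(red, green, blue, transparency_pct):
    """Build a KML color string (aabbggrr); alpha 00 is fully transparent."""
    alpha = round(255 * (100 - transparency_pct) / 100)
    return f"{alpha:02x}{blue:02x}{green:02x}{red:02x}"


def risk_zone_kml(name, ring_lon_lat, color):
    """Return a KML string holding one styled polygon placemark."""
    ET.register_namespace("", KML_NS)

    def tag(local_name):
        return f"{{{KML_NS}}}{local_name}"

    kml = ET.Element(tag("kml"))
    doc = ET.SubElement(kml, tag("Document"))
    style = ET.SubElement(doc, tag("Style"), id="risk")
    poly_style = ET.SubElement(style, tag("PolyStyle"))
    ET.SubElement(poly_style, tag("color")).text = color

    placemark = ET.SubElement(doc, tag("Placemark"))
    ET.SubElement(placemark, tag("name")).text = name
    ET.SubElement(placemark, tag("styleUrl")).text = "#risk"
    polygon = ET.SubElement(placemark, tag("Polygon"))
    boundary = ET.SubElement(polygon, tag("outerBoundaryIs"))
    ring = ET.SubElement(boundary, tag("LinearRing"))
    # KML coordinates are "lon,lat,alt" tuples separated by whitespace.
    ET.SubElement(ring, tag("coordinates")).text = " ".join(
        f"{lon},{lat},0" for lon, lat in ring_lon_lat
    )
    return ET.tostring(kml, encoding="unicode")


# Hypothetical closed ring (first point repeated), not a real risk zone.
ring = [(3.40, 6.45), (3.41, 6.45), (3.41, 6.46), (3.40, 6.46), (3.40, 6.45)]
red_60 = kml_color(255, 0, 0, transparency_pct=60)
document = risk_zone_kml("Very high risk (illustrative)", ring, red_60)
print(red_60)
```

The resulting string can be saved with a .kml extension and opened in any
viewer that reads KML.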

The flood risk data in KML format (the main format accepted by Google Earth)
was loaded into the Google Earth engine for further manipulation. Through
this synchronization, it was possible to map and estimate vulnerable
infrastructure spatially and visually by digitizing lines within the Google
Earth engine. This platform also offers an accurate geovisualization
capability that supports quick decision making. Figure 11 gives a clear
visual impression of vulnerable areas, buildings and road infrastructure in
Google Earth.


Figure 11. Satellite map of possible flood risk area and vulnerable infrastructures. 
The images are clipped scenes from the Google Earth engine, which constitutes
a major part of the generated database system. A fundamental benefit of this
modeling method, which involves locally aligning a well-integrated, carefully
and logically generated flood risk mask with high resolution remotely sensed
data, is that it is possible to track down a specific residential building,
monument, hotel, administrative building, religious or shopping complex, as
the case may be, that lies within the risk zone.

From the perspective of interpretation, the result is highly simplified and
visually appealing for policy makers who do not have expert knowledge of GIS
and spatial analysis. Subsequently, the vectorized road and rail network
database was exported as a KML file from the Google Earth engine and imported
into ArcMap as a shapefile. This database was then overlaid (using the
spatial intersect technique) on the flood risk database to calculate the
amount of access infrastructure exposed to the disaster (Figure 12).

For the purpose of detailed visualization, Figure 12 was zoomed in on the
most urbanized areas of the state: the Lagos Island, Eti-Osa, Surulere, Lagos
Mainland, Apapa, Ajeromi-Ifelodun and Amuwo-Odofin administrative areas. The
result of the spatial database integration shows that the overall length of
road potentially at risk is 300.91 km. High priority infrastructure such as
dual carriage roads has a total length of 54.5 km exposed to flooding
disaster, which is about 18 percent of the overall length.
Figure 12. An integrated geo-database for assessing susceptible infrastructures.
Single carriage roads have a total length of 246.41 km (81.9 percent) exposed
to flood risk (Table 4). This information will greatly assist planners and
decision makers in allocating resources for flood eventualities and in
deciding which road category should be given first attention based on
carriage capacity and level of usage.

In addition, for ad hoc mitigation planning and impact reduction, this
information will be useful for prioritizing and selecting infrastructure
serving areas of higher population concentration and high-valued economic
centers. For instance, the multiple dual carriage roads that connect the high
density Lagos Mainland area with Lagos Island are of high economic value and
are high priority links because they connect the high density islands to the
other parts of the city.

Evidence from Figure 12 reveals that such access infrastructure crosses the
five-class potential risk zones on both sides (the edges of Lagos Mainland
and the Island) for a distance of up to 1.2 km. A total of 0.61 km of rail
track in the region is vulnerable to the hazard; of this, 6.4 percent lies
within the low risk vulnerability class while 93.6 percent lies within the
very low risk vulnerability class. These sections are the two railway
terminals, located on the Lagos Mainland near the shoreline of the Lagos
Lagoon (Figure 12).


5.3. Spatial Analysis and Estimation of Potentially Vulnerable 
Population 


Susceptibility of population to flooding is the major impetus for studying
flood disaster and finding mitigation measures. Hence, it has become a widely
used concept in flood risk management and vulnerability assessment.


Table 4. Exposed road infrastructure by priority and vulnerability class

Vulnerability class    Length of road exposed to flood risk (km)
                       Dual carriage     Single carriage
Very high              8.07 (2.7)        17.51 (5.8)
High                   13.77 (4.6)       27.32 (9.1)
Moderate               8.07 (2.7)        37.58 (12.5)
Low                    9.89 (3.3)        60.32 (20.0)
Very low               14.73 (4.9)       103.68 (34.5)
Total                  54.5 (18.2)       246.41 (81.9)
Note: values in parentheses are the corresponding percentage values.
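
The totals and shares in Table 4 follow directly from the per-class lengths,
as the short check below illustrates. The variable names are arbitrary; note
that the summed class lengths give about 300.9 km, which agrees with the
300.91 km quoted in the text up to rounding.

```python
# Exposed road length (km) per class, ordered very high -> very low (Table 4).
road_km = {
    "dual": [8.07, 13.77, 8.07, 9.89, 14.73],
    "single": [17.51, 27.32, 37.58, 60.32, 103.68],
}

length_by_type = {kind: sum(lengths) for kind, lengths in road_km.items()}
total_km = sum(length_by_type.values())
share_pct = {kind: 100 * km / total_km for kind, km in length_by_type.items()}

for kind in road_km:
    print(f"{kind}: {length_by_type[kind]:.2f} km ({share_pct[kind]:.1f}%)")
```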
However, quantifying and estimating the population exposed to risk has
become necessary, and this has raised another line of study: how to estimate
the population vulnerable to a hazard with minimal error. The most popular
method is the use of density data, which basically involves sharing the
population equally across the specified areal territory.

A fundamental weakness of this procedure is that population is distributed
evenly across the various land uses of the territory. Areas with no human
settlement (such as forests, swamps and water bodies) are likewise assigned
population values. Since population is generally known to reside within an
urban structure, this chapter presents a better procedure that distributes
population according to the urban structure using information from remotely
sensed data. The maximum likelihood supervised classification data
previously generated (Figure 7b) was disaggregated into various land uses,
and the urban footprint (areas covered with human settlements and other
building structures) was extracted for further analysis.
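
The difference between the two density approaches can be shown with a few
lines of arithmetic. The sketch below is illustrative only: the function
names are hypothetical, the Kosofe inputs are the figures reported in Table
5, and the exposed area passed to the last function is an arbitrary example
value rather than a result from the chapter.

```python
def areal_density(population, total_area_km2):
    """Conventional density: population spread over the whole territory."""
    return population / total_area_km2


def footprint_density(population, urban_area_km2):
    """Density computed over the urban footprint only."""
    return population / urban_area_km2


def population_at_risk(population, urban_area_km2, exposed_urban_km2):
    """People expected inside the exposed part of the urban footprint."""
    return footprint_density(population, urban_area_km2) * exposed_urban_km2


# Kosofe LGA as reported in Table 5: projected population, total and urban area.
kosofe_pop, kosofe_area, kosofe_urban = 851_204, 75.28, 38.55

conventional = areal_density(kosofe_pop, kosofe_area)
footprint = footprint_density(kosofe_pop, kosofe_urban)
# 2.0 km² is an arbitrary illustrative exposed area.
exposed_people = population_at_risk(kosofe_pop, kosofe_urban, 2.0)

print(round(conventional), round(footprint), round(exposed_people))
```

Because the footprint is smaller than the administrative area, the footprint
density is always at least as high as the conventional one, and no population
is assigned to unsettled land.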

To estimate the population potentially endangered in each Local Government
Area (LGA), the vectorized local administrative layer, represented by 20
polygons (corresponding to the LGAs of Lagos State) in ArcMap, was loaded
with population data for each polygon. To ensure a substantial level of
accuracy in the estimation, the 2006 population census data for the 20 LGAs
of the state was projected to 2013, matching the 2013 Landsat data from which
the urban footprint was extracted. The growth rate of 3.2 percent prescribed
by the National Population Commission was adopted for the projection.
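
A projection of this kind can be sketched as below. The chapter does not
state which growth formula was applied, so compound (geometric) annual growth
is assumed here; the function name and the base population of 500,000 are
hypothetical.

```python
def project_population(base_population, annual_rate_pct, years):
    """Project a population forward assuming compound annual growth."""
    return base_population * (1 + annual_rate_pct / 100) ** years


# Hypothetical 2006 census count projected over the 7 years to 2013.
projected_2013 = project_population(500_000, annual_rate_pct=3.2, years=2013 - 2006)
print(round(projected_2013))
```

At 3.2 percent per year, seven years of compound growth raise the base figure
by roughly a quarter.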

A topological overlay of the GIS layer carrying the administrative/population
data and the urban footprint layer was performed using attribute and feature
spatial joins (Figure 13a). The urbanized land use classes were merged and
then superimposed on the flood risk magnitude map using spatial intersect.
The resulting map captured and clipped out the urbanized areas within the
potentially exposed LGAs by vulnerability class (Figure 13b).

Figure 14 reveals that although Ikorodu is the fourth largest LGA after Epe,
Badagry and Ibeju/Lekki, it has the largest settled area (over 139 km²) in
the state. In addition, six LGAs are on the verge of achieving 100 percent
urban land coverage: Surulere, Agege, Ajeromi/Ifelodun, Ifako/Ijaye, Mainland
and Mushin. The result shows that 15 LGAs in the state are prone to flooding
disaster and that a total of 14.587 km², 13.318 km², 11.559 km², 16.899 km²
and 26.348 km² are exposed to flood hazard in the very high, high, moderate,
low and very low vulnerability classes respectively (Table 5). Of the overall
population of 8,423,846 in the 15 flood prone LGAs, an estimated 172,444
people are vulnerable to the hazard.
Figure 13. An integration of urban footprint and administrative data: (a) Overlay of
extracted urbanized land use and local administrative/population data (b) Vulnerable
urban land use by LGA.


The result shows that Kosofe LGA, which lies within the Ogun River
floodplain, has the highest estimated number of people vulnerable to flooding
(44,227). This is about 26 percent of the entire vulnerable population. It is
followed by Amuwo-Odofin and Shomolu LGAs, with vulnerable populations of
24,253 (14.1 percent) and 22,493 (13.0 percent) respectively.

The final result of this analysis is a comprehensive flood risk geo-database
system which contains both spatial and non-spatial data. A query system was
built into the database for quick and easy retrieval of flood information for
decision making.
Figure 14. Side-by-side comparison of the administrative areal extent and urbanized
area.


Table 5 provides basic planning information and is the result of a
GIS-assisted query within the created geo-database. From this database
system, it is possible to retrieve non-attribute information as it relates to
its spatial component for a particular local administrative area, and to
narrow it down further to a specific vulnerability class.
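
The kind of query described here, by LGA and optionally by vulnerability
class, can be illustrated with an in-memory SQLite table. This is a sketch
only: the chapter's geo-database was built in ArcMap, and the table name,
column names and helper function below are hypothetical. The sample rows are
the Kosofe and Apapa risk areas reported in Table 5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE risk_area (lga TEXT, vulnerability TEXT, area_km2 REAL)"
)

classes = ["Very high", "High", "Moderate", "Low", "Very low"]
areas = {
    "Kosofe": [1.131, 1.113, 0.965, 2.003, 2.214],
    "Apapa": [2.089, 1.951, 1.735, 1.612, 2.881],
}
rows = [
    (lga, cls, area)
    for lga, values in areas.items()
    for cls, area in zip(classes, values)
]
conn.executemany("INSERT INTO risk_area VALUES (?, ?, ?)", rows)


def risk_area(connection, lga, vulnerability=None):
    """Exposed urban area (km²) for an LGA, optionally for one class only."""
    sql = "SELECT COALESCE(SUM(area_km2), 0) FROM risk_area WHERE lga = ?"
    params = [lga]
    if vulnerability is not None:
        sql += " AND vulnerability = ?"
        params.append(vulnerability)
    return connection.execute(sql, params).fetchone()[0]


print(risk_area(conn, "Kosofe"))
print(risk_area(conn, "Apapa", "Low"))
```

Parameterized queries of this form keep the lookup reusable for any LGA and
class combination.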

The various derivatives and spatial information generated from this analysis
demonstrate the capability of a GIS-based system, supported by multi-scale
remote sensing and temporal data, for mapping and identifying vital
components exposed to the impact of flooding. The modeling methodology
proposed in this chapter is beneficial for both local and holistic flood risk
modeling and offers a significant degree of accuracy and simplification in
contrast to pure hydrological flood risk and prediction models, which have
been criticized for their complexity with respect to the nature of the data,
preparation, rigor of procedure and computational time (Sanyal and Lu, 2004;
Taubenbock et al., 2011).
Table 5. Urbanized areas and population at risk by LGA

                  Total area  Urbanized   Extent of risk area by vulnerability class (km²) Population distribution
LGA               extent      area extent Very high  High    Moderate  Low     Very low    Total        Population  Population
                  (km²)       (km²)                                                        population*  density     at risk
Ajeromi/Ifelodun  11.72       11.56       0.004      0.020   0.030     0.041   0.113       856,869      74,103      3,020
Alimosho          150.38      120.11      0.000      0.000   0.000     0.005   0.038       1,645,094    13,696      69
Amuwo-Odofin      176.72      81.93       3.090      2.703   2.277     3.524   4.845       410,129      5,006       24,253
Apapa             42.31       29.07       2.089      1.951   1.735     1.612   2.881       277,994      9,563       15,415
Badagry           502.15      135.80      1.170      1.076   0.850     1.736   3.069       296,378      2,182       3,788
Epe               1,220.83    47.93       0.319      0.265   0.255     0.371   1.150       226,566      4,726       1,754
Eti-Osa           184.19      106.38      3.066      2.857   2.521     2.867   4.966       353,799      3,325       16,514
Ibeju/Lekki       467.36      89.89       0.069      0.070   0.034     0.373   0.454       146,851      1,633       609
Ikeja             46.46       42.12       0.000      0.000   0.000     0.359   0.073       395,966      9,400       3,373
Ikorodu           375.32      139.16      0.417      0.435   0.398     0.400   0.834       658,148      4,729       1,895
Kosofe            75.28       38.55       1.131      1.113   0.965     2.003   2.214       851,204      22,077      44,227
Lagos Island      242.65      25.33       2.199      1.917   1.645     1.805   3.133       265,171      10,469      18,896
Mainland          23.13       21.05       0.079      0.093   0.098     0.112   0.274       777,104      36,923      4,121
Ojo               182.94      57.02       0.634      0.484   0.403     0.902   1.331       759,449      13,318      12,017
Shomolu           18.588      17.654      0.323      0.334   0.348     0.789   0.973       503,124      28,498      22,493
Total             3,720.028   855.554     14.587     13.318  11.559    16.899  26.348      8,423,846                172,444

*Projected population figures to 2013 from the 2006 national census data.


The fundamental strengths of the proposed method are simplicity, shorter
modeling time, a wide range of application, rapid mapping of vital flood
components and quick estimation of vulnerable population, infrastructure and
environmental layout. In addition, the classification of vulnerability into
five magnitudes offers planners a wide range of options to explore and
supports multi-level decisions on flood hazard mitigation.

Notably, assessing the exposed population and settled areas at the grassroots
level will facilitate coordination between the tiers of government regarding
policy making and intervention. The prime limitation of this framework is the
difficulty of finding a sufficiently accurate method to estimate the
endangered population, especially in developing African cities where
population data is aggregated over large administrative units and
neighborhood-based demographic data is lacking.

The conventional density estimation method, which assumes a homogeneous
distribution of population over space, often overlooks the fact that
population is mostly concentrated within urban space and other settled areas.
To minimize this error, human footprints such as buildings and settlement
structures, extracted from temporal satellite data (and checked against
Google data), were used as the basis for calculating density. In other words,
only the primary human settled areas were included in the estimation.

Nevertheless, even within the human settled areas population concentration
is not homogeneous, so care should be taken when using the estimated
vulnerable population figures because some areas may be overestimated and
others underestimated. How much this matters depends on the level of
information needed.

On the one hand, if the purpose is to obtain the maximum degree of accuracy
for the vulnerable population (in most cases this is a mirage, because flood
risk modeling is basically estimation and prediction), a comprehensive ground
population count is recommended using the generated flood prone area map of
the region. On the other hand, if the purpose is substantive decision making
and policy relevance (mostly involving imminent and uncertain factors), it is
more appropriate to know the correct pattern and dimension than the exact
number of people (Taubenbock et al., 2011); this estimation method is
therefore preferred because it is based on up-to-date remote sensing data and
statistical figures.


CONCLUSION 


The most effective method of combating and mitigating the impact of flood
disaster is to conduct a pre-risk assessment and identify vulnerable factors
before a flooding episode. This is the practice in the more advanced
countries of the world, but developing countries, especially those in Africa,
are far from disaster preparedness. In these countries, with particular
reference to Nigeria, public policies concerning flood management are
oriented towards disaster alleviation and the provision of aid.

Public decision makers and other governmental bodies in Nigeria find it more
convenient to set up ad hoc disaster emergency management agencies with huge
budgets instead of concentrating manpower, technical skills and available
financial resources on potential flood impact assessment and mitigation. The
reason for this one-sided policy is generally the lack of well synthesized
planning information.
GIS is a reliable tool for spatial planning because of the level and quality
of information it generates. Its ability to integrate multi-scale and
multi-temporal data from diverse sources makes the platform a powerful force
for contemporary development and problem solving.

The quality and nature of the results further demonstrate the robustness of
the GIS-driven procedure prescribed in this chapter. The series of maps
produced and the database system generated are indispensable tools for
decision makers in flood management and risk analysis.

The results are well simplified, especially for policy formulators who do not
have strong knowledge of interpreting spatial data. This is aided by the
geovisualization capability, which provides a quick visual impression of the
pattern and degree of vulnerability.

Moreover, the results are directly applicable to public policies, especially
those related to urban and town planning in the study area. For example,
evidence from this analysis points to the conclusion that urban planners need
to revisit the water body setback policy. The flood mask generated through
integrated procedures will be highly beneficial when the setback legislation
comes up for review. In addition, should decision makers decide to take more
drastic measures for the safety of the general public by relocating
activities and residents from the floodplain and risk zones to risk-free
areas, the created database will serve as a guideline for such a project.

The identified flood prone areas, infrastructure and population, grouped by
vulnerability class, provide a range of options and will assist in priority
ranking. The integration of the flood risk data with Google maps will help to
explore building structures and track down residences (including their
spatial location) that are exposed to a particular risk magnitude. Such a
spatially synchronized database would facilitate campaign and enlightenment
programs by presenting basic spatial information on flood risk and
vulnerability. This could be done by collaborating with Google map makers to
upload the flood mask data to the internet through their map server, from
where individuals could access it at any place and time.
ABSTRACT


Detection of favorable zones for sustainable groundwater supply in
terrains underlain by crystalline rocks requires the integration of two
approaches: Remote Sensing and GIS. Landsat ETM+ images were
processed using ERDAS IMAGINE 9.2 and ASTER images using
ArcGIS 9.3. Parameters controlling groundwater accumulation, such as
rainfall, lineaments, lithology, slopes, and drainage network, were
evaluated in terms of five potential classes (very good, good, moderate,
low and very low) and integrated in the GIS tool. The resulting map shows
that 5% of the study area presents a very good groundwater potential;
this part is mainly concentrated in the south of the study area and on
the MTZ. Very low potentials constitute 13% of the area and are located
essentially in the north of the study area, particularly on granites.
Combining the results generated by the GIS with NDVI and borehole
productivities shows that good groundwater potentials are generally well
correlated with high vegetation activity in the dry season, except in
areas affected by forest fires, which are frequent in the region. High
borehole productivities (11.5-30 m³/h) are observed in zones presenting
high groundwater potential in the GIS results, and very low borehole
productivities (0.6-2.5 m³/h) are found in the north of the study area,
corresponding to the low and very low groundwater potentials. Good,
moderate and low groundwater potentials represent respectively 18, 33
and 30% of the total investigated surface area. This integrated approach,
combining RS, GIS and hydrodynamic parameters such as borehole
productivities, can help improve knowledge of groundwater resources in
the context of the hard rock aquifers of south-eastern Senegal and can
reduce the high rate of failed wells. RS and GIS can thus be used as
efficient tools for assessing groundwater potential at large scale.


Keywords: Remote Sensing, GIS, groundwater, NDVI, Lineaments, 
crystalline rocks, Senegal 


1. INTRODUCTION


In the sub-Saharan regions, water storage issues have been increasingly
difficult to tackle, especially where the aquifers are located in crystalline
rocks. The Sabodala region (Kédougou-Kéniéba Inlier in Senegal), like many
basement regions, belongs to the Birimian basement rocks characterized by
discontinuous aquifers. The hydrogeological system consists of weathered and
fractured aquifers which depend on tectonic structure and on climatic effects
on the rock formations. The availability of groundwater resources is limited,
and supplying the population of Sabodala with fresh water is therefore a
great concern because of the low yield of boreholes and the high rate of
unsuccessful drilling. Locating favorable drilling zones for sustainable
groundwater supply in crystalline rock terrain requires the integration of
two approaches: Remote Sensing (RS) and geographic information systems (GIS).
Many authors have demonstrated the usefulness of GIS and RS for natural
resources management and monitoring. According to Ismail (2011), the use of
satellite-based RS has made it possible to map large areas with greater
accuracy for the assessment and management of various resources. Teeuw (1995)
proposed an integrated approach of RS and GIS techniques to improve site
selection for borehole drilling in the Volta basin of northern Ghana. The
application of GIS technology allows swift organization, quantification and
interpretation of large quantities of hydrogeological data with more accuracy
and minimal risk of human error (Pinder, 2002). For Sisay (2007), remote
sensing provides the advantage of access to large coverage, even in
inaccessible areas. It is a rapid and cost-effective tool for producing
valuable data on geology, geomorphology, lineaments, slope, etc. that help in
deciphering groundwater potential zones. A systematic integration of these
data with follow-up hydrogeological investigation provides rapid and
cost-effective delineation of groundwater potential zones. Despite extensive
research and technological advancement, the study of groundwater remains
risky, as there is no direct method to observe water below the surface. Its
presence or absence can only be inferred indirectly by studying geological
and surface parameters.

The Birimian formations, which are the most representative formations, are 
gold-bearing, as in most of the West African craton. The 
geological formations are mainly made up of basalts, andesites, rhyodacites, 
gabbros, peridotites and volcano-sedimentary rocks intruded by granites 
(Bassot, 1966). These formations, which have been repeatedly deformed and 
metamorphosed to lower amphibolite and greenschist facies by granite 
intrusions, are often overlain by a thick lateritic mantle, which covers 
more than two thirds of the geological strata; these sometimes crop out in 
the riverbeds. The weathering profiles are up to twenty meters thick, 
with different horizons. The alluvium can also reach a few meters in thickness in 
the riverbeds. A new mapping of the Birimian formations (Théveniaut & al 
2010) identified two groups and three suites in eastern 
Senegal: 


• Volcano-sedimentary Groups of Mako and Dialé-Dalema; 
• Magmatic suites of Sandikounda-Soukouta, Saraya and Boboti. 


The hydrographic network comprises the Gambia River and the Falémé River, 
both fed from the Fouta Djalon in Guinea. The tributaries of these rivers are 
non-perennial and dry up during the dry season (Camus & Debuisson, 1964). The 
area is characterized by a dry season from October to April-May and a rainy 
season that usually starts at the end of April and continues until October, 
with maximum rainfall in August. The average annual rainfall is about 1200 
mm at the Kédougou station (Mall, 2009). The hydrogeological context of 
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the study area is characterized by fractured, discontinuous and semi- 
continuous aquifers which are represented by the weathered fringe of hard 
rocks with yields varying from 0.6 to 30 m³/h and high unsuccessful drilling 
rates (Diouf, 1999). 
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Figure 1. Location map of the study area. 
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II. METHOD AND TOOLS 


Landsat ETM+ (2010) images were processed using ERDAS 
IMAGINE 9.2, and ASTER images using ArcGIS 9.3. Landsat images were 
filtered to highlight lineaments with a low-pass filter (3x3) and 
directional filtering (7x7 Sobel filter). The ASTER images were processed 
in ArcGIS 9.3 with the Spatial Analyst tools and the Hydrology module 
in order to generate the drainage network and slopes. Lineament density 
and drainage density were computed with the Line Density tool, which calculates 
the density of linear features in the neighborhood of each output raster cell. 
Density is calculated in units of length per unit of area. All input rasters were 
generated, reclassified, weighted and overlaid using the Weighted Overlay 
module. The Reclassification tools provide an effective way to perform the 
conversion: each class value in an input raster is assigned a new value based 
on an evaluation scale. These new values are derived from the original 
input raster values. Each input raster is weighted according to its importance, 
or percent influence. The weight is a relative percentage, and the sum of the 
percent influence weights must equal 100. Changing the evaluation scales or 
the percentage influences can change the results of the weighted overlay 
analysis (Silverman, 1986). The output data were combined in the model with 
parameters controlling groundwater accumulation, such as rainfall and 
lithology (Al Saud, 2010; Hossam & al, 2011). These parameters were 
evaluated in terms of five potential classes (very good, good, moderate, 
low and very low) and weighted from 
1 to 9 before being integrated into the GIS. All the thematic maps 
(rainfall, geology, lineament density, slope and drainage density) were 
converted to raster format and then assigned their respective theme weight and 
class rank, as shown in Table 1. Each raster map was reclassified into these 
five classes of potentiality. The weighted 
overlay analysis was performed with the Spatial Analyst module of ArcGIS 9.3 
(Mehnaz, 2011), integrating the most influential parameters 
controlling groundwater storage in the study area. The NDVI (Normalized 
Difference Vegetation Index) was calculated from the Landsat ETM+ image of 
April 2010 using the NIR (ETM4) and Red (ETM3) bands, with the formula 
given in Eq. 1. The NDVI is used to analyze remote sensing 
measurements and assess the presence of live green vegetation. In areas with a 
shallow water table, the presence of live green vegetation indicates the 
availability of groundwater during summer, and hence NDVI is highly 
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important (National Remote Sensing Agency 2008). In this study, groundwater 
productivity in boreholes, combined with NDVI, is used to validate 
the results given by the GIS tool. 
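
The reclassify-then-overlay procedure described above can be illustrated with a minimal sketch in plain Python. It is not the ArcGIS Weighted Overlay tool: the function names `reclassify` and `weighted_overlay`, the 2x2 grids and the two-layer influences (60% and 40%) are assumptions made for illustration. Only the 1 to 9 rank scale, the class breaks taken from Table 1 and the rule that percent influences must sum to 100 come from the text.

```python
from bisect import bisect_right


def reclassify(grid, breaks, ranks):
    """Map each cell value to a rank using ascending class breaks.

    `breaks` has one fewer entry than `ranks`; a value equal to a
    break falls in the upper class.
    """
    if len(breaks) != len(ranks) - 1:
        raise ValueError("need exactly one fewer break than ranks")
    return [[ranks[bisect_right(breaks, v)] for v in row] for row in grid]


def weighted_overlay(ranked_grids, influences):
    """Cell-by-cell weighted sum of ranked grids (influences in percent)."""
    if sum(influences) != 100:
        raise ValueError("percent influences must sum to 100")
    rows, cols = len(ranked_grids[0]), len(ranked_grids[0][0])
    return [
        [sum(g[r][c] * w for g, w in zip(ranked_grids, influences)) / 100
         for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2x2 inputs: annual rainfall (mm) and slope (%).
rainfall = [[1200, 900], [800, 700]]
slope = [[0.5, 1.5], [4.0, 10.0]]

# More rain gives a higher rank; a steeper slope gives a lower rank.
rain_rank = reclassify(rainfall, [750, 850, 950, 1150], [1, 3, 5, 7, 9])
slope_rank = reclassify(slope, [1, 2, 3, 5], [9, 7, 5, 3, 1])

# Two-layer demonstration with assumed influences of 60% and 40%.
score = weighted_overlay([rain_rank, slope_rank], [60, 40])
```

With these inputs the wet, flat cell scores 9 and the dry, steep cell scores 1, which is the behaviour the evaluation scale is meant to produce.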


II.1. Rainfall 
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Figure 2. Isohyets distribution in the study area (MEPNBRLA, 2009). 
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Table 1. Classification of influencing factors on groundwater storage 


Class                  Very good    Good             Moderate   Low        Very low 
Class weight           9            7                5          3          1 

Rainfall (30%)         2.7          2.1              1.5        0.9        0.3 
  Isohyets (mm)        1250-1150    1150-950         950-850    850-750    750-650 

Lineaments (25%)       2.25         1.75             1.25       0.75       0.25 
  Length (km/km²)      38-48        28-38            19-28      9.5-19     0-9.5 

Geology (20%)          1.8          1.4              1          0.6        0.2 
  Rock type            Gabbros      Volcano-clastic  Schist     Granite    Basalt 

Slope (15%)            1.35         1.05             0.75       0.45       0.15 
  Slope (%)            0-1          1-2              2-3        3-5        5-35 

Drainage (10%)         0.9          0.7              0.5        0.3        0.1 
  Length (km/km²)      0-1          1-1.8            1.8-2.7    2.7-3.6    3.6-4.5 
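
The weighted values in Table 1 are the product of each class weight (9, 7, 5, 3, 1) and the percent influence of the factor. The short sketch below recomputes them; the names `CLASS_RANKS`, `INFLUENCE_PCT` and `weighted_scores` are illustrative and not part of any GIS package.

```python
# Class weights and percent influences as listed in Table 1.
CLASS_RANKS = {"very good": 9, "good": 7, "moderate": 5, "low": 3, "very low": 1}
INFLUENCE_PCT = {
    "rainfall": 30,
    "lineaments": 25,
    "geology": 20,
    "slope": 15,
    "drainage": 10,
}


def weighted_scores(influence_pct, class_ranks):
    """Return {factor: {class: class weight x influence / 100}}."""
    if sum(influence_pct.values()) != 100:
        raise ValueError("percent influences must sum to 100")
    return {
        factor: {cls: rank * pct / 100 for cls, rank in class_ranks.items()}
        for factor, pct in influence_pct.items()
    }


scores = weighted_scores(INFLUENCE_PCT, CLASS_RANKS)

# A cell rated "very good" on every factor reaches the top of the scale,
# and a cell rated "very low" on every factor reaches the bottom.
best_total = sum(s["very good"] for s in scores.values())
worst_total = sum(s["very low"] for s in scores.values())
```

Because the influences sum to 100, the overlay score of any cell stays within the 1 to 9 range of the class weights.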


Rainfall is the most important parameter in groundwater recharge. In the area, 
rainfall is concentrated mainly in the rainy season, which starts at the end of April 
and continues until the end of October in the southern part of the study area. By 
contrast, in the northern part, rains start in late June and usually stop at the 
beginning of October. Maximum precipitation is recorded in August in both 
climatic provinces. The sector is characterized by a rainfall gradient 
from south to north (Mall, 2009). This is shown by the spatial distribution of 
isohyets, which increase from north to south, with the 1250 mm isohyet observed 
south of Kédougou and the 650 mm isohyet in the northern part of the area. This 
contrast between climatic provinces is reflected in the structure of 
plant communities. The southern part is a forest domain with a high 
density of vegetation, sometimes associated with gallery forests that follow the 
meanders of rivers. Towards the northern part of the study 
area, the forest gives way to savannah dominated by thorny vegetation well 
adapted to drought conditions. 


II.2. Geology 


The type of geological formation is an important factor in the development of 
water reserves in crystalline basement. Petrography determines the type 
of weathering profile that develops on the rock. Thus, the regolith is 
much thicker on basic rocks than on granites and schists, as more clay develops 
on basic rocks than on granites. However, granites are much more resistant 
to weathering and mechanical disintegration than other rock types. Granites 
can therefore become good aquifers if they are affected by faulting. By contrast, they become 


324 I. Mall, M. Diaw, H. D. Madioune et al. 


less productive if they are unfractured, because of the high resistance of granites to 
weathering due to the high quartz content of their petrographic structure. 
Basic rocks and schists may have good aquifer potential because their weathered 
upper part is often very well developed. This regolith has a capacitive 
function that plays an important role in deferred groundwater recharge and 
may therefore be a useful aquifer for hand-dug wells. The most interesting 
potentials are recorded in ultra-basic rocks and in carbonate formations. 
Volcano-sedimentary formations can be good aquifers with good groundwater 
potential. Schists present a moderate potential, and low potentials are 
found on granites and acid volcanic rocks. 


II.3. Lineaments 


Lineaments have been the subject of several studies in West Africa (Biémi, 
Engelec, Kouamé, Savadogo, Savané). Their role in the search for zones 
favorable to borehole siting is now well established. On 
satellite imagery, lineaments correspond to image discontinuities 
expressed by the juxtaposition or alignment of simple or composite physiographic 
elements of various kinds (morphology, hydrography, vegetation, differences in 
surface tone, ...) whose parts are arranged in a rectilinear or 
slightly curvilinear relationship (Kouamé, 1999). On the ground, they correspond to lithological 
discontinuities (contacts between different formations) or structural discontinuities (faults, joints, 
dykes, ...) (Gronayes et al.). Four major lineament directions are noted in the 
region. The N60 and N50 directions (Fig. 5) are the most representative, with 
respectively 10.8% and 10.5% of total lineament length, followed by the N170-N180 
and N0-N10 directions and finally the N100 direction, which represents about 8% of 
the total lineament length. The preponderance of the N60 and N50 directions is 
essentially due to a regional tectonic structure called the main transcurrent zone 
(MTZ), a regional shear zone that affected the whole region. 
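
Percentages such as those quoted for the N50 and N60 directions are obtained by grouping lineaments into azimuth classes and summing their lengths. The sketch below shows one way to do this; it assumes 10° classes, planar coordinates with x pointing east and y pointing north, and hypothetical segments, and it is not the procedure used to produce Fig. 5. The function names are illustrative.

```python
import math


def azimuth_deg(x1, y1, x2, y2):
    """Orientation of a segment in degrees clockwise from north, in [0, 180)."""
    angle = math.degrees(math.atan2(x2 - x1, y2 - y1))
    return angle % 180.0


def length_share_by_direction(segments, bin_width=10):
    """Percent of total segment length falling in each azimuth class.

    Keys are the lower bound of each class, e.g. 50 for N50 to N60.
    """
    totals = {}
    for x1, y1, x2, y2 in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        az = azimuth_deg(x1, y1, x2, y2)
        # The final modulo folds a rounding result of 180 back into class 0.
        start = (int(az // bin_width) * bin_width) % 180
        totals[start] = totals.get(start, 0.0) + length
    grand_total = sum(totals.values())
    return {k: 100.0 * v / grand_total for k, v in sorted(totals.items())}


# Hypothetical lineaments as (x1, y1, x2, y2), not data from the study.
segments = [(0, 0, 3, 4), (0, 0, 0, 10), (0, 0, -4, 3)]
shares = length_share_by_direction(segments)
```

Weighting by length rather than by count keeps a few long regional structures from being outweighed by many short segments.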


II.4. Slope 


Plains occupy most of the area, and the terrain is often marked by 
plateaus covered by a thick lateritic mantle. Slopes are steeper, especially in the 
south-western part of Kédougou in the vicinity of Mako and in the center of the 
study area, where they can reach 32% (Fig. 6). In the north of the study area, the 
land becomes relatively flat, with few elevations and gentle slopes near the 
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Falémé River. The elevations are more important in the south of the study 
area. This geomorphological configuration makes the terrain in the 
southern part more conducive to runoff than to infiltration, and therefore less 
favorable to groundwater storage. 


[Figure 3 legend, Birimian formations: predominantly rhyolitic or dacitic; 
predominantly granitic; pelites, siltites, greywackes and volcano-sedimentary 
formations; predominantly sandstone or quartzite; predominantly basic or 
ultrabasic magmatic formations; carbonate formations.] 


Figure 3. Hydrogeological units map (Wuilleumier & al 2010). 
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Figure 4. Lineaments and lineament density map. 
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Figure 5. Rose diagramm of lineament orientations. 


II.5. Drainage Network 


The drainage network is a factor that depends essentially on watershed 
physiography, including its shape, size, slopes and geological formations. Like 
slope, drainage density is inversely related to groundwater storage. 
A dense drainage network means higher runoff, which reduces 
groundwater storage capacity. In the study area, the impermeable nature of the 
hard rock formations explains the very high density of the drainage network, 
which essentially corresponds to intermittent streams that dry up as early as 
January. However, if the topographical conditions are favorable (low slope), 
the gullies that accompany the riverbeds can constitute substantial 
groundwater reserves of limited lateral extent at the scale of villages. 


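$$\mathrm{NDVI} = \frac{\mathrm{ETM4} - \mathrm{ETM3}}{\mathrm{ETM4} + \mathrm{ETM3}} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}} \qquad \text{(Eq. 1)}$$

Nearly all satellite vegetation indices employ this difference formula to quantify 
the density of plant growth on the Earth: near-infrared radiation minus visible radiation, 
divided by near-infrared radiation plus visible radiation. 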
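
Eq. 1 can be applied pixel by pixel, as in the following minimal sketch. The band values are hypothetical, the function names are illustrative, and returning `None` where NIR + RED equals zero is an assumption made here to avoid division by zero; the chapter does not state how such pixels were handled.

```python
def ndvi(nir, red):
    """NDVI for one pixel (Eq. 1); None where NIR + RED is zero."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply Eq. 1 cell by cell to two bands of identical shape."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical values for ETM+ band 4 (NIR) and band 3 (Red).
nir_band = [[80, 50], [0, 30]]
red_band = [[20, 50], [0, 60]]
result = ndvi_grid(nir_band, red_band)
```

Values range from -1 to 1; dense green vegetation reflects strongly in the near infrared and gives values well above zero, while bare or burnt ground gives values near or below zero.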
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Figure 6. Slope distribution in the study area. 
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Figure 7. Drainage network and density. 
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II.6. Organization Diagram of Integrated Parameters in the GIS 
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Figure 8. Integrated parameters (Rainfall, lineament density, Geology, Slope, Drainage 
density) and their percentage weights. 


III. RESULTS AND DISCUSSION 


Integrating the weighted parameters (rainfall, lineaments, geology, slope, 
drainage network) in a GIS platform yields a 
groundwater potential map of the study area (Fig. 9) with five potentiality classes, 
ranging from very high to very low groundwater potential. The 
high potential zone is located in the south of the Sabodala mining region, 
in contrast to the northern part, where groundwater potential is low because of 
decreasing rainfall and a lithology consisting essentially of 
granites. About 5% of the 
investigated area has very good potential. These zones are mostly found in 
the south of the area, in the southern part of the Gambia River 
watershed between the 1250 and 1150 mm isohyets and on the main transcurrent 
zone (MTZ). However, isolated zones with very good potential are also noted 
in the sector located between the 1150 and 850 mm isohyets; 
they are found on the southern part of the Saraya granite (south and 
east of Saraya city), and south of Kanouméry village and its immediate 
surroundings. Very good potential is also localized in the volcano-sedimentary 
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formations (South Massamassa village) and Mouran village. In the North part 
of the study area, only a few sectors with very good potential are found, at 
Kossanto and south of Makana village (south of Sabodala village). Very good 
and good groundwater potential zones are associated with an intensive fracture 
system that produces high secondary permeability. Good potential zones 
occupy 18% of the study area and surround the zones of very good 
potential. Most of these zones are located south of the 850 mm isohyet; 
however, some patches with good potential are noted in some sectors 
north of the 850 mm isohyet. The NDVI results, calculated from 
Landsat images of April 2010 (end of the dry season) (Fig. 10), show that the 
very good and good potential zones correlate well with zones of 
high vegetation activity. Very low and low potential zones 
correspond to zones where vegetation activity is very low, as noted over most 
of the granites (Sandikounda-Soukouta and Saraya) and in the northern part of the 
sector above the 850 mm isohyet (Fig. 10). However, this good correlation 
between high vegetation activity and very good and good potential 
zones does not hold in the northern part of the study area in the vicinity of Soreto 
village. This could be related to the xerophytic nature of the plant communities, 
which are adapted to drought conditions and can maintain a 
high level of activity with low soil moisture. Overall, the correlation indicates that at this advanced stage 
of the dry season, vegetation activity is essentially supported by groundwater 
resources. 

So, the NDVI can be a good indicator of groundwater presence, but it 
must be used with some caution because of deforestation and recurrent bush 
fires in the area during the advanced dry season. These phenomena can mask the 
intensity of vegetation activity in zones characterized by low plant density. 
The depth of the water table must also be taken into account. Low vegetation activity 
can be found in zones where the depth of the water table exceeds 20 m, even though 
the GIS results sometimes indicate very good aquifer potential in these 
zones, as is the case south of Mouran village. Borehole 
productivities were also taken into account and show that high borehole 
productivities (about 11.5 to 30 m³/h) are observed in zones of high 
groundwater potential according to the GIS processing, while very low borehole 
productivities, ranging from 0.6 to 2.5 m³/h, are found in the north part of the 
study area, corresponding to the low and very low groundwater potential 
zones. These results show that the north part of the study area 
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appears as a poor zone for groundwater storage if we take into account the 
results of GIS and the borehole productivities. 
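
The comparison between mapped potential classes and borehole productivities can be summarized by grouping yields per class, as in the sketch below. The records are placeholder values chosen inside the ranges quoted above, not measurements from the study, and the function name `yield_range_by_class` is illustrative.

```python
from collections import defaultdict


def yield_range_by_class(boreholes):
    """Group borehole yields (m3/h) by mapped potential class.

    Returns {class: (minimum yield, maximum yield, number of boreholes)}.
    """
    grouped = defaultdict(list)
    for potential_class, yield_m3h in boreholes:
        grouped[potential_class].append(yield_m3h)
    return {cls: (min(v), max(v), len(v)) for cls, v in grouped.items()}


# Placeholder records (potential class at the borehole, yield in m3/h).
# These are NOT data from the study.
sample = [
    ("very good", 30.0),
    ("good", 11.5),
    ("very good", 18.0),
    ("low", 2.5),
    ("very low", 0.6),
]
summary = yield_range_by_class(sample)
```

If the map is a useful predictor, the yield ranges of the high potential classes should sit clearly above those of the low potential classes, with little overlap.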
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Figure 9. Groundwater potential map. 
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Figure 10. NDVI (Landsat image 04/2010). 
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CONCLUSION 


Integrating the different parameters that influence groundwater storage 
helps to identify the best sites for drilling and thus to reduce the rate of 
unsuccessful wells. The resulting map shows that 5% of the study area 
has very high groundwater storage potential, mainly concentrated 
in the south part of the study area and on the MTZ. Very low potential 
covers 13% of the area and is located essentially in the north part of the study 
area, particularly on granites. Combining the GIS results 
with NDVI and well productivities shows that good aquifer 
potentials generally correlate well with high vegetation activity in the advanced dry 
season, except in areas affected by forest fires, which are frequent in the region. 
High borehole productivities are observed in zones of high 
groundwater potential according to the GIS, and very low borehole 
productivities are observed in the north part of the study area, corresponding to 
the low and very low groundwater potential zones. These results can be an 
efficient indicator that helps decision makers to better manage their drilling 
projects and reduce the high rates of failed boreholes. The use of RS 
and GIS therefore improves knowledge of groundwater resources 
in the hard rock aquifers of south-eastern Senegal and 
can serve as an efficient tool for assessing groundwater potential at large 
scale. 
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